Can ICU be used for a database's default sort order?
I tried to arrange $subject via
create database icu encoding 'utf8' lc_ctype "en-US-x-icu" lc_collate "en-US-x-icu" template template0;
and got only
ERROR: invalid locale name: "en-US-x-icu"
which is unsurprising after looking into the code, because createdb()
checks those parameters with check_locale() which only knows about
libc-defined locale names.
Is there some way I'm missing, or is this just a not-done-yet feature?
regards, tom lane
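
The failure above can be reproduced outside the server. The following is a minimal sketch, assuming (as the message says) that the check comes down to whether libc recognizes the name; `libc_accepts_collate_locale` is a hypothetical helper written for illustration, not PostgreSQL's check_locale().

```c
#include <locale.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not PostgreSQL's check_locale(): returns true if
 * libc's setlocale() accepts `name` for LC_COLLATE. The previous setting
 * is saved and restored so the process locale is left unchanged.
 */
bool
libc_accepts_collate_locale(const char *name)
{
    const char *current = setlocale(LC_COLLATE, NULL);
    char       *saved;
    size_t      len;
    bool        ok;

    if (current == NULL)
        return false;

    /* Copy the name: later setlocale() calls may overwrite the buffer. */
    len = strlen(current) + 1;
    saved = malloc(len);
    if (saved == NULL)
        return false;
    memcpy(saved, current, len);

    ok = (setlocale(LC_COLLATE, name) != NULL);

    /* Restore whatever was in effect before the probe. */
    setlocale(LC_COLLATE, saved);
    free(saved);
    return ok;
}
```

A name such as "en-US-x-icu" is a PostgreSQL collation name rather than a libc locale name, so on a typical glibc system this probe would be expected to reject it, while "C" and "POSIX" are always accepted.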
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Jun 22, 2017 at 7:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Is there some way I'm missing, or is this just a not-done-yet feature?
It's a not-done-yet feature.
--
Peter Geoghegan
On 6/22/17 23:10, Peter Geoghegan wrote:
On Thu, Jun 22, 2017 at 7:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Is there some way I'm missing, or is this just a not-done-yet feature?
It's a not-done-yet feature.
It's something I hope to address soon.
The main definitional challenge is how to associate a pg_database entry
with a collation.
What we currently effectively do is duplicate the fields of pg_collation
in pg_database. But I imagine over time we'll add more properties in
pg_collation, along with additional ALTER COLLATION commands etc., so
duplicating all of that would be a significant amount of code
complication and result in a puzzling user interface.
Ideally, I'd like to see CREATE DATABASE ... COLLATION "foo". But the
problem is of course that collations are per-database objects. Possible
solutions:
1) Associate by name only. That is, you can create a database with any
COLLATION "foo" that you want, and it's only checked when you first
connect to or do anything in the database.
2) Create shared collations. Then we'd need a way to manage having a
mix of shared and non-shared collations around.
There are significant pros and cons to all of these ideas. Some people
I talked to appeared to prefer the shared collations approach.
Other ideas?
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Jun 23, 2017 at 11:32 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
It's something I hope to address soon.
I hope you do. I think that we'd realize significant benefits by
having ICU become the de facto standard collation provider, that most
users get without even realizing it. As things stand, you have to make
a point of specifying an ICU collation as your per-column collation
within every CREATE TABLE. That's a significant barrier to adoption.
1) Associate by name only. That is, you can create a database with any
COLLATION "foo" that you want, and it's only checked when you first
connect to or do anything in the database.
2) Create shared collations. Then we'd need a way to manage having a
mix of shared and non-shared collations around.
There are significant pros and cons to all of these ideas. Some people
I talked to appeared to prefer the shared collations approach.
I strongly prefer the second approach. The only downside that occurs
to me is that that approach requires more code. Is there something
that I've missed?
--
Peter Geoghegan
Peter Geoghegan <pg@bowt.ie> writes:
On Fri, Jun 23, 2017 at 11:32 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
1) Associate by name only. That is, you can create a database with any
COLLATION "foo" that you want, and it's only checked when you first
connect to or do anything in the database.
2) Create shared collations. Then we'd need a way to manage having a
mix of shared and non-shared collations around.
There are significant pros and cons to all of these ideas. Some people
I talked to appeared to prefer the shared collations approach.
I strongly prefer the second approach. The only downside that occurs
to me is that that approach requires more code. Is there something
that I've missed?
I'm not very clear on how you'd bootstrap template1 into anything
other than C locale in the second approach. With our existing
libc-based stuff, it's possible to define what the database's locale
is before there are any catalogs. It's not apparent how to do that with
a collation-based solution.
In my mind, collations are just a SQL-syntax wrapper for locales that
are really defined one level down. I think we'd be well advised to
carry that same approach into the database properties, because otherwise
we have circularities to deal with. So I'm imagining something more like
create database encoding 'utf8' lc_collate 'icu-en_US' lc_ctype ...
where lc_collate is just a string that we know how to interpret, the
same as now.
We could optionally reduce the amount of notation involved by merging the
lc_collate and lc_ctype parameters into one, say
create database encoding 'utf8' locale 'icu-en_US' ...
I'm not too clear on how this would play with other libc locale
functionality (lc_monetary and so on), but we'd have to deal with
that question anyway.
regards, tom lane
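
The idea of "just a string that we know how to interpret" can be illustrated with a small sketch. It assumes the "icu-" prefix spelling used in the examples above and treats any other string as a plain libc locale name; `LocaleProvider` and `split_locale_spec` are hypothetical names introduced here, not part of any proposal or of PostgreSQL.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical provider tag; not a PostgreSQL type. */
typedef enum
{
    PROVIDER_LIBC,
    PROVIDER_ICU
} LocaleProvider;

/*
 * Split a spec such as "icu-en_US" into a provider and a locale name.
 * Assumption: an "icu-" prefix selects ICU; anything else is a libc name.
 * Returns a pointer into `spec` at the start of the locale name.
 */
const char *
split_locale_spec(const char *spec, LocaleProvider *provider)
{
    static const char icu_prefix[] = "icu-";
    const size_t prefix_len = sizeof(icu_prefix) - 1;

    if (strncmp(spec, icu_prefix, prefix_len) == 0)
    {
        *provider = PROVIDER_ICU;
        return spec + prefix_len;
    }

    *provider = PROVIDER_LIBC;
    return spec;
}
```

Because the interpretation needs only the string itself, it could run before any catalogs exist, which is the bootstrapping property the message is concerned with.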
Hi.
On 23 June 2017, at 21:32, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
On 6/22/17 23:10, Peter Geoghegan wrote:
On Thu, Jun 22, 2017 at 7:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Is there some way I'm missing, or is this just a not-done-yet feature?
It's a not-done-yet feature.
It's something I hope to address soon.
Will it work only for a particular database? Or for a whole cluster during initdb also? Any chance to get this done in 11?
--
May the force be with you…
https://simply.name
On 1/31/18 11:48, Vladimir Borodin wrote:
Will it work only for a particular database? Or for a whole cluster
during initdb also? Any chance to get this done in 11?
I'm currently not working on it.
It's basically just a lot of leg work, and you need to come up with a
catalog representation. Possible options have already been addressed in
earlier threads.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hello!
On 1 Feb 2018, at 19:09, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
On 1/31/18 11:48, Vladimir Borodin wrote:
Will it work only for a particular database? Or for a whole cluster
during initdb also? Any chance to get this done in 11?
I'm currently not working on it.
It's basically just a lot of leg work, and you need to come up with a
catalog representation. Possible options have already been addressed in
earlier threads.
I can try to do this before the next CF. But ISTM that EDB and Postgres Pro already have various flavours of a similar feature. Maybe they are planning to publish that?
Best regards, Andrey Borodin.
On Fri, Feb 2, 2018 at 4:22 AM, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
I can try to do this before the next CF. But ISTM that EDB and Postgres Pro already have various flavours of a similar feature. Maybe they are planning to publish that?
I would definitely review that patch.
--
Peter Geoghegan
Hi!
On 2 Feb 2018, at 21:14, Peter Geoghegan <pg@bowt.ie> wrote:
On Fri, Feb 2, 2018 at 4:22 AM, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
I can try to do this before the next CF. But ISTM that EDB and Postgres Pro already have various flavours of a similar feature. Maybe they are planning to publish that?
I would definitely review that patch.
I've contacted Postgres Professional. Marina Polyakova has kindly provided their patch.
The patch allows using a libc locale with an ICU collation as the default for a cluster or a database.
It seems that this patch brings an important, long-awaited feature and deserves to be included in the last v11 commitfest.
Peter, everyone, do you agree with this? Or would it be better to adapt this work during the v12 cycle?
I'm planning to provide a review ASAP and make any necessary changes (this was discussed with Marina and Postgres Professional).
Best regards, Andrey Borodin.
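
The commit message of the attached patch describes a `'locale'@'provider'[.'version']` string stored in pg_database.datcollate, plus a rule for picking the provider when none is given. The following is a rough sketch of both, not code from the patch: `DatCollate`, `parse_datcollate` and `default_provider` are hypothetical names, and the handling of libc modifiers such as "@euro" is an assumption made here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical provider codes for this sketch only. */
#define SKETCH_PROVIDER_NONE '\0'
#define SKETCH_PROVIDER_ICU  'i'
#define SKETCH_PROVIDER_LIBC 'c'

typedef struct
{
    char locale[64];    /* libc-format locale name */
    char provider;      /* one of the SKETCH_PROVIDER_* codes */
    char version[32];   /* empty string if no version was present */
} DatCollate;

/*
 * Parse "locale@provider[.version]". Assumption: only a trailing "@icu" or
 * "@libc" counts as a provider, so a libc modifier such as "de_DE@euro"
 * stays part of the locale name. Returns false if a field does not fit.
 */
bool
parse_datcollate(const char *s, DatCollate *out)
{
    const char *at = strrchr(s, '@');
    size_t      loclen = strlen(s);
    size_t      provlen = 0;

    out->provider = SKETCH_PROVIDER_NONE;
    out->version[0] = '\0';

    if (at != NULL)
    {
        const char *tail = at + 1;

        if (strncmp(tail, "icu", 3) == 0 && (tail[3] == '\0' || tail[3] == '.'))
        {
            out->provider = SKETCH_PROVIDER_ICU;
            provlen = 3;
        }
        else if (strncmp(tail, "libc", 4) == 0 && (tail[4] == '\0' || tail[4] == '.'))
        {
            out->provider = SKETCH_PROVIDER_LIBC;
            provlen = 4;
        }

        if (out->provider != SKETCH_PROVIDER_NONE)
        {
            loclen = (size_t) (at - s);
            if (tail[provlen] == '.')
            {
                const char *ver = tail + provlen + 1;

                if (strlen(ver) >= sizeof(out->version))
                    return false;
                strcpy(out->version, ver);
            }
        }
    }

    if (loclen == 0 || loclen >= sizeof(out->locale))
        return false;
    memcpy(out->locale, s, loclen);
    out->locale[loclen] = '\0';
    return true;
}

/*
 * Default rule as stated in the commit message: libc for "C" and "POSIX";
 * otherwise icu in initdb, or the template database's provider in createdb.
 */
char
default_provider(const char *locale, bool in_initdb, char template_provider)
{
    if (strcmp(locale, "C") == 0 || strcmp(locale, "POSIX") == 0)
        return SKETCH_PROVIDER_LIBC;
    return in_initdb ? SKETCH_PROVIDER_ICU : template_provider;
}
```

For instance, parsing "en_US@icu.153.80" with this sketch yields locale "en_US", the ICU provider code, and version "153.80"; the version value is made up for illustration.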
Attachments:
0001-ICU-as-default-collation-provider.txt (text/plain)
From e1cb130f550952d9c9c2d9ad1c52e60699a2c968 Mon Sep 17 00:00:00 2001
From: Marina Polyakova <m.polyakova@postgrespro.ru>
Date: Fri, 9 Feb 2018 18:57:25 +0300
Subject: [PATCH] ICU as default collation provider
Now you can choose the default collation provider - libc or icu (the latter is
available only if you build PostgreSQL using --with-icu). Just pass the
appropriate locale options in libc format with the collation provider modifier
(or without it) to initdb or createdb:
[initdb|createdb]
[--locale='locale'[@icu|@libc|]]
[--lc-collate='locale'[@icu|@libc|]]
where 'locale' is in the libc format, for example, 'en_US', 'ru_RU.UTF-8' or
'C'.
You can also pass the corresponding locale options without the collation
provider modifier. In this case, in initdb, the default collation provider is
libc for locales 'C' and 'POSIX' and icu for others. If you did not specify the
collation provider for the --locale/--lc-collate options in createdb, this will
be libc for locales 'C' and 'POSIX' and the default collation provider from the
template database for others. Note that, as usual, the --lc-collate option takes
precedence over the --locale option regardless of whether it contains the
modifier of collation provider or not.
Note that you can use icu as the default collation provider only for the
locales that libc also has in your operating system. This was done in part
because we need databases with default collation and SQL_ASCII encoding for
regression tests, but ICU in PostgreSQL does not support this encoding. Also in
this case, we don't need to unset the appropriate locale environment
variables (because other programs don't understand the new format).
So in fact ICU is used as the default collation/ctype provider where there is
already a choice for using the collation provider. In other places all the work
is done by libc as usual.
Note that to use icu as the default collation provider, lc_collate and lc_ctype
must be the same.
The default database collation with the collation provider and version (for ICU
collations) is stored in pg_database.datcollate in the format
'locale'@'collprovider'[.'collversion'].
Important: when you try to connect to a database,
the ability to use the selected collation provider and the version of the
default collation will be checked. But when you try to start a cluster server it
is not checked.
Important: in this commit there's no appropriate support for using
pg_upgrade/pg_dump/pg_dumpall for clusters that do not support this feature. In
this case pg_dump/pg_dumpall retrieve the LC_COLLATE database settings unchanged
(= without mentioning the provider) from the old cluster.
---
doc/src/sgml/charset.sgml | 55 ++
doc/src/sgml/ref/create_database.sgml | 8 +-
doc/src/sgml/ref/createdb.sgml | 18 +-
doc/src/sgml/ref/initdb.sgml | 9 +-
doc/src/sgml/regress.sgml | 17 +
src/backend/catalog/information_schema.sql | 2 +-
src/backend/commands/collationcmds.c | 33 +-
src/backend/commands/dbcommands.c | 152 +++-
src/backend/main/main.c | 5 +-
src/backend/regex/regc_pg_locale.c | 40 +-
src/backend/utils/adt/formatting.c | 111 ++-
src/backend/utils/adt/like.c | 16 +-
src/backend/utils/adt/pg_locale.c | 390 +++++++---
src/backend/utils/adt/selfuncs.c | 14 +-
src/backend/utils/adt/varlena.c | 270 ++++---
src/backend/utils/init/postinit.c | 118 ++-
src/backend/utils/mb/encnames.c | 4 +-
src/bin/initdb/Makefile | 2 +-
src/bin/initdb/initdb.c | 387 +++++++++-
src/bin/pg_dump/pg_dump.c | 30 +-
src/bin/psql/describe.c | 10 +-
src/bin/scripts/Makefile | 2 +-
src/bin/scripts/createdb.c | 14 +-
src/common/Makefile | 2 +-
src/common/pg_collation_fn_common.c | 90 +++
src/fe_utils/.gitignore | 1 +
src/fe_utils/Makefile | 11 +-
src/include/commands/dbcommands.h | 3 +-
src/include/common/pg_collation_fn_common.h | 22 +
src/include/pg_config.h.win32 | 4 +
src/include/port.h | 34 +
src/include/port/win32.h | 2 +-
src/include/utils/pg_locale.h | 12 +-
src/interfaces/libpq/.gitignore | 1 +
src/interfaces/libpq/Makefile | 2 +-
src/port/chklocale.c | 598 +++++++++++++++
src/test/Makefile | 2 +-
src/test/default_collation/Makefile | 28 +
src/test/default_collation/icu.utf8/.gitignore | 2 +
src/test/default_collation/icu.utf8/Makefile | 11 +
.../icu.utf8/t/001_default_collation.pl | 799 +++++++++++++++++++++
src/test/default_collation/icu/.gitignore | 2 +
src/test/default_collation/icu/Makefile | 11 +
.../icu/t/001_default_collation.pl | 605 ++++++++++++++++
src/test/default_collation/libc.utf8/.gitignore | 2 +
src/test/default_collation/libc.utf8/Makefile | 11 +
.../libc.utf8/t/001_default_collation.pl | 703 ++++++++++++++++++
src/test/default_collation/libc/.gitignore | 2 +
src/test/default_collation/libc/Makefile | 11 +
.../libc/t/001_default_collation.pl | 355 +++++++++
src/test/regress/expected/collate.icu.utf8.out | 10 +-
src/test/regress/expected/collate.linux.utf8.out | 10 +-
src/test/regress/sql/collate.icu.utf8.sql | 8 +-
src/test/regress/sql/collate.linux.utf8.sql | 8 +-
src/tools/msvc/Mkvcbuild.pm | 26 +-
55 files changed, 4730 insertions(+), 365 deletions(-)
create mode 100644 src/common/pg_collation_fn_common.c
create mode 100644 src/include/common/pg_collation_fn_common.h
create mode 100644 src/test/default_collation/Makefile
create mode 100644 src/test/default_collation/icu.utf8/.gitignore
create mode 100644 src/test/default_collation/icu.utf8/Makefile
create mode 100644 src/test/default_collation/icu.utf8/t/001_default_collation.pl
create mode 100644 src/test/default_collation/icu/.gitignore
create mode 100644 src/test/default_collation/icu/Makefile
create mode 100644 src/test/default_collation/icu/t/001_default_collation.pl
create mode 100644 src/test/default_collation/libc.utf8/.gitignore
create mode 100644 src/test/default_collation/libc.utf8/Makefile
create mode 100644 src/test/default_collation/libc.utf8/t/001_default_collation.pl
create mode 100644 src/test/default_collation/libc/.gitignore
create mode 100644 src/test/default_collation/libc/Makefile
create mode 100644 src/test/default_collation/libc/t/001_default_collation.pl
diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml
index dc3fd34..f28e0ec 100644
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@@ -537,6 +537,61 @@ SELECT * FROM test1 ORDER BY a || b COLLATE "fr_FR";
a database.
</para>
+ <para>
+ You can specify the default collation provider with the <option>--locale</option>
+ and <option>--lc-collate</option> options of the <xref linkend="app-initdb"/> or
+ <xref linkend="app-createdb"/> commands, as follows:
+<programlisting>
+--locale=<replaceable>locale</replaceable>[@<replaceable>provider</replaceable>]
+--lc-collate=<replaceable>locale</replaceable>[@<replaceable>provider</replaceable>]
+</programlisting>
+ where <replaceable>provider</replaceable> can take the <literal>icu</literal>
+ or <literal>libc</literal> value, and <replaceable>locale</replaceable> is specified
+ in the <literal>libc</literal> format. You can only specify a single
+ locale provider after the <literal>@</literal> symbol.
+ The <literal>--lc-collate</literal> option overrides the
+ <literal>--locale</literal> setting, regardless of whether it specifies the
+ collation provider.
+ </para>
+
+ <para>
+ If you omit the collation provider options, <literal>libc</literal>
+ provider is used for <literal>C</literal> and <literal>POSIX</literal>
+ locales. For other locales, the default providers are:
+ <itemizedlist spacing="compact">
+ <listitem>
+ <para>
+ <literal>icu</literal> at the cluster level
+ </para>
+ </listitem>
+ <listitem>
+ <para>Default collation provider from the template database at
+ the database level
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+
+ <important>
+ <para>
+ You can only use the <literal>icu</literal> collation provider for locales that are
+ supported by <literal>libc</literal> in your operating system and satisfy all
+ restrictions applicable to <literal>icu</literal>.
+ </para>
+ </important>
+
+ <para>
+ When you connect to a database,
+ <productname>PostgreSQL</productname> checks that the selected collation
+ provider and the version of the default collation are supported.
+ You can find the default database collation and the collation provider
+ in <structname>pg_database.datcollate</structname>. For ICU collations, collation version is
+ also stored:
+ <programlisting>
+<replaceable>locale</replaceable>@<replaceable>provider</replaceable>[.<replaceable>version</replaceable>]
+</programlisting>
+ </para>
+
<sect3>
<title>Standard Collations</title>
diff --git a/doc/src/sgml/ref/create_database.sgml b/doc/src/sgml/ref/create_database.sgml
index b2c9e24..8b2e153 100644
--- a/doc/src/sgml/ref/create_database.sgml
+++ b/doc/src/sgml/ref/create_database.sgml
@@ -25,7 +25,7 @@ CREATE DATABASE <replaceable class="parameter">name</replaceable>
[ [ WITH ] [ OWNER [=] <replaceable class="parameter">user_name</replaceable> ]
[ TEMPLATE [=] <replaceable class="parameter">template</replaceable> ]
[ ENCODING [=] <replaceable class="parameter">encoding</replaceable> ]
- [ LC_COLLATE [=] <replaceable class="parameter">lc_collate</replaceable> ]
+ [ LC_COLLATE [=] <replaceable class="parameter">lc_collate</replaceable>[@<replaceable class="parameter">provider</replaceable>] ]
[ LC_CTYPE [=] <replaceable class="parameter">lc_ctype</replaceable> ]
[ TABLESPACE [=] <replaceable class="parameter">tablespace_name</replaceable> ]
[ ALLOW_CONNECTIONS [=] <replaceable class="parameter">allowconn</replaceable> ]
@@ -112,13 +112,17 @@ CREATE DATABASE <replaceable class="parameter">name</replaceable>
</listitem>
</varlistentry>
<varlistentry>
- <term><replaceable class="parameter">lc_collate</replaceable></term>
+ <term><replaceable class="parameter">lc_collate</replaceable>[@<replaceable class="parameter">provider</replaceable>]</term>
<listitem>
<para>
Collation order (<literal>LC_COLLATE</literal>) to use in the new database.
This affects the sort order applied to strings, e.g. in queries with
ORDER BY, as well as the order used in indexes on text columns.
The default is to use the collation order of the template database.
+ Optionally, you can specify the default collation provider after the
+ <literal>@</literal> symbol, as explained in
+ <xref linkend="collation-managing"/>. Supported values are <literal>icu</literal>
+ and <literal>libc</literal>.
See below for additional restrictions.
</para>
</listitem>
diff --git a/doc/src/sgml/ref/createdb.sgml b/doc/src/sgml/ref/createdb.sgml
index 2658efe..dbf87d3 100644
--- a/doc/src/sgml/ref/createdb.sgml
+++ b/doc/src/sgml/ref/createdb.sgml
@@ -121,22 +121,34 @@ PostgreSQL documentation
</varlistentry>
<varlistentry>
- <term><option>-l <replaceable class="parameter">locale</replaceable></option></term>
- <term><option>--locale=<replaceable class="parameter">locale</replaceable></option></term>
+ <term><option>-l <replaceable class="parameter">locale</replaceable>[@<replaceable>provider</replaceable>]</option></term>
+ <term><option>--locale=<replaceable class="parameter">locale</replaceable>[@<replaceable>provider</replaceable>]</option></term>
<listitem>
<para>
Specifies the locale to be used in this database. This is equivalent
to specifying both <option>--lc-collate</option> and <option>--lc-ctype</option>.
</para>
+
+ <para>
+ Optionally, you can specify the default collation provider after the
+ <literal>@</literal> symbol. Supported values are <literal>icu</literal>
+ and <literal>libc</literal>. For details, see <xref linkend="collation-managing"/>.
+ </para>
</listitem>
</varlistentry>
<varlistentry>
- <term><option>--lc-collate=<replaceable class="parameter">locale</replaceable></option></term>
+ <term><option>--lc-collate=<replaceable class="parameter">locale</replaceable>[@<replaceable>provider</replaceable>]</option></term>
<listitem>
<para>
Specifies the LC_COLLATE setting to be used in this database.
</para>
+
+ <para>
+ Optionally, you can specify the default collation provider after the
+ <literal>@</literal> symbol. Supported values are <literal>icu</literal>
+ and <literal>libc</literal>. For details, see <xref linkend="collation-managing"/>.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/initdb.sgml b/doc/src/sgml/ref/initdb.sgml
index 585665f..87adcd5 100644
--- a/doc/src/sgml/ref/initdb.sgml
+++ b/doc/src/sgml/ref/initdb.sgml
@@ -203,7 +203,7 @@ PostgreSQL documentation
</varlistentry>
<varlistentry>
- <term><option>--locale=<replaceable>locale</replaceable></option></term>
+ <term><option>--locale=<replaceable>locale</replaceable>[@<replaceable>provider</replaceable>]</option></term>
<listitem>
<para>
Sets the default locale for the database cluster. If this
@@ -211,11 +211,16 @@ PostgreSQL documentation
environment that <command>initdb</command> runs in. Locale
support is described in <xref linkend="locale"/>.
</para>
+ <para>
+ Optionally, you can specify the default collation provider after the
+ <literal>@</literal> symbol. Supported values are <literal>icu</literal>
+ and <literal>libc</literal>. For details, see <xref linkend="collation-managing"/>.
+ </para>
</listitem>
</varlistentry>
<varlistentry>
- <term><option>--lc-collate=<replaceable>locale</replaceable></option></term>
+ <term><option>--lc-collate=<replaceable>locale</replaceable>[@<replaceable>provider</replaceable>]</option></term>
<term><option>--lc-ctype=<replaceable>locale</replaceable></option></term>
<term><option>--lc-messages=<replaceable>locale</replaceable></option></term>
<term><option>--lc-monetary=<replaceable>locale</replaceable></option></term>
diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index 53716a0..fefddd8 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -280,6 +280,23 @@ make check EXTRA_TESTS='collate.icu.utf8 collate.linux.utf8' LANG=en_US.utf8
</sect2>
<sect2>
+ <title>Extra TAP Tests for Default Collations</title>
+
+ <para>
+ To test the default collations on Linux/glibc platforms,
+ you can run extra TAP tests, as follows:
+<screen>
+make -C src/test/default_collation check-utf8
+</screen>
+ These tests only succeed when run in a database that uses the UTF-8
+ encoding. As these tests are TAP-based, you can only run them if
+ <productname>PostgreSQL</productname> was configured with the
+ <option>--enable-tap-tests</option> option.
+ For details, see <xref linkend="regress-tap"/>.
+ </para>
+ </sect2>
+
+ <sect2>
<title>Testing Hot Standby</title>
<para>
diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql
index 686528c..640a9e1 100644
--- a/src/backend/catalog/information_schema.sql
+++ b/src/backend/catalog/information_schema.sql
@@ -397,7 +397,7 @@ CREATE VIEW character_sets AS
CAST(c.collname AS sql_identifier) AS default_collate_name
FROM pg_database d
LEFT JOIN (pg_collation c JOIN pg_namespace nc ON (c.collnamespace = nc.oid))
- ON (datcollate = collcollate AND datctype = collctype)
+ ON (datcollate = (collcollate || '@libc') AND datctype = collctype)
WHERE d.datname = current_database()
ORDER BY char_length(c.collname) DESC, c.collname ASC -- prefer full/canonical name
LIMIT 1;
diff --git a/src/backend/commands/collationcmds.c b/src/backend/commands/collationcmds.c
index d0b5cdb..db4f67d 100644
--- a/src/backend/commands/collationcmds.c
+++ b/src/backend/commands/collationcmds.c
@@ -28,6 +28,7 @@
#include "commands/comment.h"
#include "commands/dbcommands.h"
#include "commands/defrem.h"
+#include "common/pg_collation_fn_common.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "utils/builtins.h"
@@ -163,11 +164,8 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
if (collproviderstr)
{
- if (pg_strcasecmp(collproviderstr, "icu") == 0)
- collprovider = COLLPROVIDER_ICU;
- else if (pg_strcasecmp(collproviderstr, "libc") == 0)
- collprovider = COLLPROVIDER_LIBC;
- else
+ collprovider = get_collprovider(collproviderstr);
+ if (!is_valid_nondefault_collprovider(collprovider))
ereport(ERROR,
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
errmsg("unrecognized collation provider: %s",
@@ -193,7 +191,8 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
else
{
collencoding = GetDatabaseEncoding();
- check_encoding_locale_matches(collencoding, collcollate, collctype);
+ check_encoding_locale_matches(collencoding, collcollate, collctype,
+ collprovider);
}
}
@@ -435,26 +434,6 @@ cmpaliases(const void *a, const void *b)
#ifdef USE_ICU
/*
- * Get the ICU language tag for a locale name.
- * The result is a palloc'd string.
- */
-static char *
-get_icu_language_tag(const char *localename)
-{
- char buf[ULOC_FULLNAME_CAPACITY];
- UErrorCode status;
-
- status = U_ZERO_ERROR;
- uloc_toLanguageTag(localename, buf, sizeof(buf), TRUE, &status);
- if (U_FAILURE(status))
- ereport(ERROR,
- (errmsg("could not convert locale name \"%s\" to language tag: %s",
- localename, u_errorName(status))));
-
- return pstrdup(buf);
-}
-
-/*
* Get a comment (specifically, the display name) for an ICU locale.
* The result is a palloc'd string, or NULL if we can't get a comment
* or find that it's not all ASCII. (We can *not* accept non-ASCII
@@ -699,7 +678,7 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
name = uloc_getAvailable(i);
langtag = get_icu_language_tag(name);
- collcollate = U_ICU_VERSION_MAJOR_NUM >= 54 ? langtag : name;
+ collcollate = get_icu_collate(name, langtag);
/*
* Be paranoid about not allowing any non-ASCII strings into
diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index d2020d0..6fbd8b3 100644
--- a/src/backend/commands/dbcommands.c
+++ b/src/backend/commands/dbcommands.c
@@ -34,6 +34,7 @@
#include "catalog/indexing.h"
#include "catalog/objectaccess.h"
#include "catalog/pg_authid.h"
+#include "catalog/pg_collation.h"
#include "catalog/pg_database.h"
#include "catalog/pg_db_role_setting.h"
#include "catalog/pg_subscription.h"
@@ -44,6 +45,7 @@
#include "commands/defrem.h"
#include "commands/seclabel.h"
#include "commands/tablespace.h"
+#include "common/pg_collation_fn_common.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -141,6 +143,14 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
int notherbackends;
int npreparedxacts;
createdb_failure_params fparms;
+ char *src_canonname;
+ char src_collprovider;
+ char *dbcanonname = NULL;
+ char dbcollprovider;
+ char *dbcollate_full_name;
+ char *icu_wincollate = NULL;
+ char *langtag = NULL;
+ const char *collate;
/* Extract options from the statement node tree */
foreach(option, stmt->options)
@@ -350,8 +360,28 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
/* If encoding or locales are defaulted, use source's setting */
if (encoding < 0)
encoding = src_encoding;
+
+ check_locale_collprovider(src_collate, &src_canonname, &src_collprovider,
+ NULL);
+
+ if (!is_valid_nondefault_collprovider(src_collprovider))
+ /* This could happen when manually creating a mess in the catalogs. */
+ ereport(FATAL,
+ (errmsg("could not find out the collation provider for datcollate \"%s\" of template database \"%s\"",
+ src_collate, dbtemplate)));
+
if (dbcollate == NULL)
- dbcollate = src_collate;
+ {
+ dbcollate = src_canonname;
+ dbcollprovider = src_collprovider;
+ }
+ else
+ {
+ check_locale_collprovider(dbcollate, &dbcanonname, &dbcollprovider,
+ NULL);
+ dbcollate = dbcanonname;
+ }
+
if (dbctype == NULL)
dbctype = src_ctype;
@@ -362,18 +392,88 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
errmsg("invalid server encoding %d", encoding)));
/* Check that the chosen locales are valid, and get canonical spellings */
- if (!check_locale(LC_COLLATE, dbcollate, &canonname))
- ereport(ERROR,
- (errcode(ERRCODE_WRONG_OBJECT_TYPE),
- errmsg("invalid locale name: \"%s\"", dbcollate)));
- dbcollate = canonname;
- if (!check_locale(LC_CTYPE, dbctype, &canonname))
+
+ if (!check_locale(LC_CTYPE, dbctype, &canonname, '\0'))
ereport(ERROR,
(errcode(ERRCODE_WRONG_OBJECT_TYPE),
errmsg("invalid locale name: \"%s\"", dbctype)));
dbctype = canonname;
- check_encoding_locale_matches(encoding, dbcollate, dbctype);
+ /* we always check lc_collate for libc */
+ if (!check_locale(LC_COLLATE, dbcollate, &canonname, COLLPROVIDER_LIBC))
+ ereport(ERROR,
+ (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("invalid locale name: \"%s\" (provider \"%s\")",
+ dbcollate, get_collprovider_name(COLLPROVIDER_LIBC))));
+ dbcollate = canonname;
+
+ /* determine the collation provider if we haven't already done it */
+ if (!is_valid_nondefault_collprovider(dbcollprovider))
+ {
+ if (locale_is_c(dbcollate))
+ dbcollprovider = COLLPROVIDER_LIBC;
+ else
+ dbcollprovider = src_collprovider;
+ }
+
+ Assert(is_valid_nondefault_collprovider(dbcollprovider));
+
+#ifndef USE_ICU
+ if (dbcollprovider == COLLPROVIDER_ICU)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("ICU is not supported in this build"),
+ errhint("You need to rebuild PostgreSQL using --with-icu.")));
+#endif
+
+ /* check lc_collate and lc_ctype for icu if we need it */
+ if (dbcollprovider == COLLPROVIDER_ICU)
+ {
+ if (!check_locale(LC_COLLATE, dbcollate, NULL, dbcollprovider))
+ ereport(ERROR,
+ (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("invalid locale name: \"%s\" (provider \"%s\")",
+ dbcollate, get_collprovider_name(dbcollprovider))));
+
+ if (strcmp(dbcollate, dbctype) != 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("collations with different collate and ctype values are not supported by ICU")));
+ }
+
+ check_encoding_locale_matches(encoding, dbcollate, dbctype, dbcollprovider);
+
+ /* get the collation version */
+
+#ifdef USE_ICU
+ if (dbcollprovider == COLLPROVIDER_ICU)
+ {
+ collate = (const char *) dbcollate;
+#ifdef WIN32
+ if (!locale_is_c(collate))
+ {
+ icu_wincollate = check_icu_winlocale(collate);
+ collate = (const char *) icu_wincollate;
+ }
+#endif /* WIN32 */
+ langtag = get_icu_language_tag(collate);
+ collate = get_icu_collate(collate, langtag);
+ }
+ else
+#endif /* USE_ICU */
+ {
+ /* COLLPROVIDER_LIBC */
+ collate = (const char *) dbcollate;
+ }
+
+ dbcollate_full_name = get_full_collation_name(
+ dbcollate, dbcollprovider,
+ get_collation_actual_version(dbcollprovider, collate));
+
+ if (strlen(dbcollate_full_name) >= NAMEDATALEN)
+ ereport(ERROR,
+ (errmsg("the full database collation name \"%s\" is too long",
+ dbcollate_full_name)));
/*
* Check that the new encoding and locale settings match the source
@@ -395,11 +495,11 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
pg_encoding_to_char(src_encoding)),
errhint("Use the same encoding as in the template database, or use template0 as template.")));
- if (strcmp(dbcollate, src_collate) != 0)
+ if (strcmp(dbcollate_full_name, src_collate) != 0)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("new collation (%s) is incompatible with the collation of the template database (%s)",
- dbcollate, src_collate),
+ dbcollate_full_name, src_collate),
errhint("Use the same collation as in the template database, or use template0 as template.")));
if (strcmp(dbctype, src_ctype) != 0)
@@ -522,7 +622,7 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
new_record[Anum_pg_database_datdba - 1] = ObjectIdGetDatum(datdba);
new_record[Anum_pg_database_encoding - 1] = Int32GetDatum(encoding);
new_record[Anum_pg_database_datcollate - 1] =
- DirectFunctionCall1(namein, CStringGetDatum(dbcollate));
+ DirectFunctionCall1(namein, CStringGetDatum(dbcollate_full_name));
new_record[Anum_pg_database_datctype - 1] =
DirectFunctionCall1(namein, CStringGetDatum(dbctype));
new_record[Anum_pg_database_datistemplate - 1] = BoolGetDatum(dbistemplate);
@@ -690,6 +790,16 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
*/
ForceSyncCommit();
}
+
+ pfree(src_canonname);
+ pfree(dbcollate_full_name);
+ if (dbcanonname)
+ pfree(dbcanonname);
+ if (langtag)
+ pfree(langtag);
+ if (icu_wincollate)
+ pfree(icu_wincollate);
+
PG_END_ENSURE_ERROR_CLEANUP(createdb_failure_callback,
PointerGetDatum(&fparms));
@@ -719,7 +829,8 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
* Note: if you change this policy, fix initdb to match.
*/
void
-check_encoding_locale_matches(int encoding, const char *collate, const char *ctype)
+check_encoding_locale_matches(int encoding, const char *collate, const char *ctype,
+ char collprovider)
{
int ctype_encoding = pg_get_encoding_from_locale(ctype, true);
int collate_encoding = pg_get_encoding_from_locale(collate, true);
@@ -753,6 +864,23 @@ check_encoding_locale_matches(int encoding, const char *collate, const char *cty
collate),
errdetail("The chosen LC_COLLATE setting requires encoding \"%s\".",
pg_encoding_to_char(collate_encoding))));
+
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+#ifdef USE_ICU
+ if (!(is_encoding_supported_by_icu(encoding) ||
+ (encoding == PG_SQL_ASCII && superuser())))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("encoding \"%s\" is not supported for ICU locales",
+ pg_encoding_to_char(encoding))));
+#else
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("ICU is not supported in this build"),
+ errhint("You need to rebuild PostgreSQL using --with-icu.")));
+#endif
+ }
}
/* Error cleanup callback for createdb */
diff --git a/src/backend/main/main.c b/src/backend/main/main.c
index 38853e3..cb27d62 100644
--- a/src/backend/main/main.c
+++ b/src/backend/main/main.c
@@ -32,6 +32,7 @@
#endif
#include "bootstrap/bootstrap.h"
+#include "catalog/pg_collation.h"
#include "common/username.h"
#include "port/atomics.h"
#include "postmaster/postmaster.h"
@@ -306,8 +307,8 @@ startup_hacks(const char *progname)
static void
init_locale(const char *categoryname, int category, const char *locale)
{
- if (pg_perm_setlocale(category, locale) == NULL &&
- pg_perm_setlocale(category, "C") == NULL)
+ if (pg_perm_setlocale(category, locale, COLLPROVIDER_LIBC) == NULL &&
+ pg_perm_setlocale(category, "C", COLLPROVIDER_LIBC) == NULL)
elog(FATAL, "could not adopt \"%s\" locale nor C locale for %s",
locale, categoryname);
}
diff --git a/src/backend/regex/regc_pg_locale.c b/src/backend/regex/regc_pg_locale.c
index acbed2e..e836553 100644
--- a/src/backend/regex/regc_pg_locale.c
+++ b/src/backend/regex/regc_pg_locale.c
@@ -16,6 +16,7 @@
*/
#include "catalog/pg_collation.h"
+#include "common/pg_collation_fn_common.h"
#include "utils/pg_locale.h"
/*
@@ -240,8 +241,13 @@ pg_set_regex_collation(Oid collation)
}
else
{
+ char collprovider;
+
if (collation == DEFAULT_COLLATION_OID)
+ {
pg_regex_locale = 0;
+ collprovider = get_default_collprovider();
+ }
else if (OidIsValid(collation))
{
/*
@@ -250,6 +256,7 @@ pg_set_regex_collation(Oid collation)
* have to be considered below.
*/
pg_regex_locale = pg_newlocale_from_collation(collation);
+ collprovider = pg_regex_locale->provider;
}
else
{
@@ -263,24 +270,35 @@ pg_set_regex_collation(Oid collation)
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
+ Assert(is_valid_nondefault_collprovider(collprovider));
+
+ if (collprovider == COLLPROVIDER_ICU)
+ {
#ifdef USE_ICU
- if (pg_regex_locale && pg_regex_locale->provider == COLLPROVIDER_ICU)
pg_regex_strategy = PG_REGEX_LOCALE_ICU;
- else
+#else
+ /* shouldn't happen */
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
#endif
- if (GetDatabaseEncoding() == PG_UTF8)
- {
- if (pg_regex_locale)
- pg_regex_strategy = PG_REGEX_LOCALE_WIDE_L;
- else
- pg_regex_strategy = PG_REGEX_LOCALE_WIDE;
}
else
{
- if (pg_regex_locale)
- pg_regex_strategy = PG_REGEX_LOCALE_1BYTE_L;
+ /* COLLPROVIDER_LIBC */
+
+ if (GetDatabaseEncoding() == PG_UTF8)
+ {
+ if (pg_regex_locale)
+ pg_regex_strategy = PG_REGEX_LOCALE_WIDE_L;
+ else
+ pg_regex_strategy = PG_REGEX_LOCALE_WIDE;
+ }
else
- pg_regex_strategy = PG_REGEX_LOCALE_1BYTE;
+ {
+ if (pg_regex_locale)
+ pg_regex_strategy = PG_REGEX_LOCALE_1BYTE_L;
+ else
+ pg_regex_strategy = PG_REGEX_LOCALE_1BYTE;
+ }
}
pg_regex_collation = collation;
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index b8bd4ca..af38e72 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -1452,7 +1452,7 @@ typedef int32_t (*ICU_Convert_Func) (UChar *dest, int32_t destCapacity,
UErrorCode *pErrorCode);
static int32_t
-icu_convert_case(ICU_Convert_Func func, pg_locale_t mylocale,
+icu_convert_case(ICU_Convert_Func func, const char *locale,
UChar **buff_dest, UChar *buff_source, int32_t len_source)
{
UErrorCode status;
@@ -1462,7 +1462,7 @@ icu_convert_case(ICU_Convert_Func func, pg_locale_t mylocale,
*buff_dest = palloc(len_dest * sizeof(**buff_dest));
status = U_ZERO_ERROR;
len_dest = func(*buff_dest, len_dest, buff_source, len_source,
- mylocale->info.icu.locale, &status);
+ locale, &status);
if (status == U_BUFFER_OVERFLOW_ERROR)
{
/* try again with adjusted length */
@@ -1470,7 +1470,7 @@ icu_convert_case(ICU_Convert_Func func, pg_locale_t mylocale,
*buff_dest = palloc(len_dest * sizeof(**buff_dest));
status = U_ZERO_ERROR;
len_dest = func(*buff_dest, len_dest, buff_source, len_source,
- mylocale->info.icu.locale, &status);
+ locale, &status);
}
if (U_FAILURE(status))
ereport(ERROR,
@@ -1528,8 +1528,15 @@ str_tolower(const char *buff, size_t nbytes, Oid collid)
else
{
pg_locale_t mylocale = 0;
+ char collprovider;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY,
+ use_icu;
- if (collid != DEFAULT_COLLATION_OID)
+ if (collid == DEFAULT_COLLATION_OID)
+ {
+ collprovider = get_default_collprovider();
+ }
+ else
{
if (!OidIsValid(collid))
{
@@ -1543,25 +1550,43 @@ str_tolower(const char *buff, size_t nbytes, Oid collid)
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
mylocale = pg_newlocale_from_collation(collid);
+ collprovider = mylocale->provider;
}
-#ifdef USE_ICU
- if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
+
+ if (use_icu)
{
+#ifdef USE_ICU
int32_t len_uchar;
int32_t len_conv;
UChar *buff_uchar;
UChar *buff_conv;
+ const char *locale;
+
+ if (mylocale)
+ locale = mylocale->info.icu.locale;
+ else
+ locale = get_icu_default_collate();
len_uchar = icu_to_uchar(&buff_uchar, buff, nbytes);
- len_conv = icu_convert_case(u_strToLower, mylocale,
+ len_conv = icu_convert_case(u_strToLower, locale,
&buff_conv, buff_uchar, len_uchar);
icu_from_uchar(&result, buff_conv, len_conv);
pfree(buff_uchar);
+#else /* not USE_ICU */
+ /* shouldn't happen */
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
+#endif /* not USE_ICU */
}
else
-#endif
{
+ /* use_libc */
+
if (pg_database_encoding_max_length() > 1)
{
wchar_t *workspace;
@@ -1650,8 +1675,15 @@ str_toupper(const char *buff, size_t nbytes, Oid collid)
else
{
pg_locale_t mylocale = 0;
+ char collprovider;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY,
+ use_icu;
- if (collid != DEFAULT_COLLATION_OID)
+ if (collid == DEFAULT_COLLATION_OID)
+ {
+ collprovider = get_default_collprovider();
+ }
+ else
{
if (!OidIsValid(collid))
{
@@ -1665,25 +1697,43 @@ str_toupper(const char *buff, size_t nbytes, Oid collid)
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
mylocale = pg_newlocale_from_collation(collid);
+ collprovider = mylocale->provider;
}
-#ifdef USE_ICU
- if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
+
+ if (use_icu)
{
+#ifdef USE_ICU
int32_t len_uchar,
len_conv;
UChar *buff_uchar;
UChar *buff_conv;
+ const char *locale;
+
+ if (mylocale)
+ locale = mylocale->info.icu.locale;
+ else
+ locale = get_icu_default_collate();
len_uchar = icu_to_uchar(&buff_uchar, buff, nbytes);
- len_conv = icu_convert_case(u_strToUpper, mylocale,
+ len_conv = icu_convert_case(u_strToUpper, locale,
&buff_conv, buff_uchar, len_uchar);
icu_from_uchar(&result, buff_conv, len_conv);
pfree(buff_uchar);
+#else /* not USE_ICU */
+ /* shouldn't happen */
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
+#endif /* not USE_ICU */
}
else
-#endif
{
+ /* use_libc */
+
if (pg_database_encoding_max_length() > 1)
{
wchar_t *workspace;
@@ -1773,8 +1823,15 @@ str_initcap(const char *buff, size_t nbytes, Oid collid)
else
{
pg_locale_t mylocale = 0;
+ char collprovider;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY,
+ use_icu;
- if (collid != DEFAULT_COLLATION_OID)
+ if (collid == DEFAULT_COLLATION_OID)
+ {
+ collprovider = get_default_collprovider();
+ }
+ else
{
if (!OidIsValid(collid))
{
@@ -1788,25 +1845,43 @@ str_initcap(const char *buff, size_t nbytes, Oid collid)
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
mylocale = pg_newlocale_from_collation(collid);
+ collprovider = mylocale->provider;
}
-#ifdef USE_ICU
- if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
+
+ if (use_icu)
{
+#ifdef USE_ICU
int32_t len_uchar,
len_conv;
UChar *buff_uchar;
UChar *buff_conv;
+ const char *locale;
+
+ if (mylocale)
+ locale = mylocale->info.icu.locale;
+ else
+ locale = get_icu_default_collate();
len_uchar = icu_to_uchar(&buff_uchar, buff, nbytes);
- len_conv = icu_convert_case(u_strToTitle_default_BI, mylocale,
+ len_conv = icu_convert_case(u_strToTitle_default_BI, locale,
&buff_conv, buff_uchar, len_uchar);
icu_from_uchar(&result, buff_conv, len_conv);
pfree(buff_uchar);
+#else /* not USE_ICU */
+ /* shouldn't happen */
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
+#endif /* not USE_ICU */
}
else
-#endif
{
+ /* use_libc */
+
if (pg_database_encoding_max_length() > 1)
{
wchar_t *workspace;
diff --git a/src/backend/utils/adt/like.c b/src/backend/utils/adt/like.c
index ff716c5..28ea64f 100644
--- a/src/backend/utils/adt/like.c
+++ b/src/backend/utils/adt/like.c
@@ -167,6 +167,9 @@ Generic_Text_IC_like(text *str, text *pat, Oid collation)
plen;
pg_locale_t locale = 0;
bool locale_is_c = false;
+ char collprovider = COLLPROVIDER_LIBC;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY;
+ bool use_icu;
if (lc_ctype_is_c(collation))
locale_is_c = true;
@@ -184,7 +187,18 @@ Generic_Text_IC_like(text *str, text *pat, Oid collation)
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
locale = pg_newlocale_from_collation(collation);
+ collprovider = locale->provider;
}
+ else
+ {
+ collprovider = get_default_collprovider();
+ }
+
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
/*
* For efficiency reasons, in the single byte case we don't call lower()
@@ -194,7 +208,7 @@ Generic_Text_IC_like(text *str, text *pat, Oid collation)
* way.
*/
- if (pg_database_encoding_max_length() > 1 || (locale && locale->provider == COLLPROVIDER_ICU))
+ if (pg_database_encoding_max_length() > 1 || use_icu)
{
/* lower's result is never packed, so OK to use old macros here */
pat = DatumGetTextPP(DirectFunctionCall1Coll(lower, collation,
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index a3dc3be..5d7c66b 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -56,7 +56,10 @@
#include "access/htup_details.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_control.h"
+#include "catalog/pg_database.h"
+#include "common/pg_collation_fn_common.h"
#include "mb/pg_wchar.h"
+#include "miscadmin.h"
#include "utils/builtins.h"
#include "utils/hsearch.h"
#include "utils/lsyscache.h"
@@ -132,6 +135,10 @@ static HTAB *collation_cache = NULL;
static char *IsoLocaleName(const char *); /* MSVC specific */
#endif
+#ifdef USE_ICU
+static char *check_icu_locale(const char *locale);
+#endif
+
/*
* pg_perm_setlocale
@@ -146,13 +153,45 @@ static char *IsoLocaleName(const char *); /* MSVC specific */
* also be unset to fully ensure that, but that has to be done elsewhere after
* all the individual LC_XXX variables have been set correctly. (Thank you
* Perl for making this kluge necessary.)
+ *
+ * Set collprovider to '\0' if category is not LC_COLLATE.
*/
-char *
-pg_perm_setlocale(int category, const char *locale)
+const char *
+pg_perm_setlocale(int category, const char *locale, char collprovider)
{
- char *result;
+ const char *result;
const char *envvar;
char *envbuf;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY =
+ category != LC_COLLATE || collprovider == COLLPROVIDER_LIBC;
+ bool use_icu =
+ category == LC_COLLATE && collprovider == COLLPROVIDER_ICU;
+
+ Assert(use_libc || use_icu);
+
+ if (use_icu)
+ {
+#ifdef USE_ICU
+ UErrorCode status = U_ZERO_ERROR;
+ char *icu_locale = check_icu_locale(locale);
+
+ if (icu_locale == NULL && locale != NULL)
+ return NULL; /* fall out immediately on failure */
+
+ uloc_setDefault(icu_locale, &status);
+ if (U_FAILURE(status))
+ return NULL; /* fall out immediately on failure */
+
+ result = uloc_getDefault();
+ if (icu_locale)
+ pfree(icu_locale);
+ return result;
+#else /* not USE_ICU */
+ return NULL; /* fall out immediately on failure */
+#endif /* not USE_ICU */
+ }
+
+ /* use libc */
#ifndef WIN32
result = setlocale(category, locale);
@@ -167,7 +206,7 @@ pg_perm_setlocale(int category, const char *locale)
#ifdef LC_MESSAGES
if (category == LC_MESSAGES)
{
- result = (char *) locale;
+ result = locale;
if (locale == NULL || locale[0] == '\0')
return result;
}
@@ -218,7 +257,7 @@ pg_perm_setlocale(int category, const char *locale)
#ifdef WIN32
result = IsoLocaleName(locale);
if (result == NULL)
- result = (char *) locale;
+ result = locale;
#endif /* WIN32 */
break;
#endif /* LC_MESSAGES */
@@ -259,34 +298,102 @@ pg_perm_setlocale(int category, const char *locale)
* it seems that on most implementations that's the only thing it's good for;
* we could wish that setlocale gave back a canonically spelled version of
* the locale name, but typically it doesn't.)
+ *
+ * Set collprovider to '\0' if category is not LC_COLLATE.
*/
bool
-check_locale(int category, const char *locale, char **canonname)
+check_locale(int category, const char *locale, char **canonname,
+ char collprovider)
{
- char *save;
- char *res;
+ const char *save;
+ const char *res;
+ char *save_dup;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY =
+ category != LC_COLLATE || collprovider == COLLPROVIDER_LIBC;
+ bool use_icu =
+ category == LC_COLLATE && collprovider == COLLPROVIDER_ICU;
+#ifdef USE_ICU
+ UErrorCode status;
+ char *icu_locale;
+#endif
+
+ Assert(use_libc || use_icu);
if (canonname)
*canonname = NULL; /* in case of failure */
- save = setlocale(category, NULL);
- if (!save)
- return false; /* won't happen, we hope */
+#ifndef USE_ICU
+ /* cannot use icu functions */
+ if (use_icu)
+ return false;
+#endif
+
+#ifdef USE_ICU
+ if (use_icu)
+ {
+ save = uloc_getDefault();
+ if (!save)
+ return false; /* won't happen, we hope */
+ }
+ else
+#endif
+ {
+ /* use_libc */
+ save = setlocale(category, NULL);
+ if (!save)
+ return false; /* won't happen, we hope */
+ }
/* save may be pointing at a modifiable scratch variable, see above. */
- save = pstrdup(save);
+ save_dup = pstrdup(save);
/* set the locale with setlocale, to see if it accepts it. */
- res = setlocale(category, locale);
+#ifdef USE_ICU
+ if (use_icu)
+ {
+ icu_locale = check_icu_locale(locale);
+
+ if (icu_locale == NULL && locale != NULL)
+ return false; /* locale name could not be resolved */
+
+ status = U_ZERO_ERROR;
+ uloc_setDefault(icu_locale, &status);
+ if (U_FAILURE(status))
+ return false; /* won't happen, we hope */
+
+ res = uloc_getDefault();
+ if (icu_locale)
+ pfree(icu_locale);
+ }
+ else
+#endif
+ {
+ /* use_libc */
+ res = setlocale(category, locale);
+ }
/* save canonical name if requested. */
if (res && canonname)
*canonname = pstrdup(res);
/* restore old value. */
- if (!setlocale(category, save))
- elog(WARNING, "failed to restore old locale \"%s\"", save);
- pfree(save);
+#ifdef USE_ICU
+ if (use_icu)
+ {
+ status = U_ZERO_ERROR;
+ uloc_setDefault(save_dup, &status);
+ if (U_FAILURE(status))
+ elog(WARNING, "ICU error: failed to restore old locale \"%s\"",
+ save_dup);
+ }
+ else
+#endif
+ {
+ /* use_libc */
+ if (!setlocale(category, save_dup))
+ elog(WARNING, "failed to restore old locale \"%s\"", save_dup);
+ }
+ pfree(save_dup);
return (res != NULL);
}
@@ -306,7 +413,7 @@ check_locale(int category, const char *locale, char **canonname)
bool
check_locale_monetary(char **newval, void **extra, GucSource source)
{
- return check_locale(LC_MONETARY, *newval, NULL);
+ return check_locale(LC_MONETARY, *newval, NULL, '\0');
}
void
@@ -318,7 +425,7 @@ assign_locale_monetary(const char *newval, void *extra)
bool
check_locale_numeric(char **newval, void **extra, GucSource source)
{
- return check_locale(LC_NUMERIC, *newval, NULL);
+ return check_locale(LC_NUMERIC, *newval, NULL, '\0');
}
void
@@ -330,7 +437,7 @@ assign_locale_numeric(const char *newval, void *extra)
bool
check_locale_time(char **newval, void **extra, GucSource source)
{
- return check_locale(LC_TIME, *newval, NULL);
+ return check_locale(LC_TIME, *newval, NULL, '\0');
}
void
@@ -366,7 +473,7 @@ check_locale_messages(char **newval, void **extra, GucSource source)
* On Windows, we can't even check the value, so accept blindly
*/
#if defined(LC_MESSAGES) && !defined(WIN32)
- return check_locale(LC_MESSAGES, *newval, NULL);
+ return check_locale(LC_MESSAGES, *newval, NULL, '\0');
#else
return true;
#endif
@@ -380,7 +487,7 @@ assign_locale_messages(const char *newval, void *extra)
* We ignore failure, as per comment above.
*/
#ifdef LC_MESSAGES
- (void) pg_perm_setlocale(LC_MESSAGES, newval);
+ (void) pg_perm_setlocale(LC_MESSAGES, newval, '\0');
#endif
}
@@ -1096,21 +1203,14 @@ lookup_collation_cache(Oid collation, bool set_flags)
/* Attempt to set the flags */
HeapTuple tp;
Form_pg_collation collform;
- const char *collcollate;
- const char *collctype;
tp = SearchSysCache1(COLLOID, ObjectIdGetDatum(collation));
if (!HeapTupleIsValid(tp))
elog(ERROR, "cache lookup failed for collation %u", collation);
collform = (Form_pg_collation) GETSTRUCT(tp);
- collcollate = NameStr(collform->collcollate);
- collctype = NameStr(collform->collctype);
-
- cache_entry->collate_is_c = ((strcmp(collcollate, "C") == 0) ||
- (strcmp(collcollate, "POSIX") == 0));
- cache_entry->ctype_is_c = ((strcmp(collctype, "C") == 0) ||
- (strcmp(collctype, "POSIX") == 0));
+ cache_entry->collate_is_c = locale_is_c(NameStr(collform->collcollate));
+ cache_entry->ctype_is_c = locale_is_c(NameStr(collform->collctype));
cache_entry->flags_valid = true;
@@ -1141,20 +1241,28 @@ lc_collate_is_c(Oid collation)
if (collation == DEFAULT_COLLATION_OID)
{
static int result = -1;
- char *localeptr;
+ char collprovider;
if (result >= 0)
return (bool) result;
- localeptr = setlocale(LC_COLLATE, NULL);
- if (!localeptr)
- elog(ERROR, "invalid LC_COLLATE setting");
-
- if (strcmp(localeptr, "C") == 0)
- result = true;
- else if (strcmp(localeptr, "POSIX") == 0)
- result = true;
- else
+
+ collprovider = get_default_collprovider();
+ Assert(is_valid_nondefault_collprovider(collprovider));
+
+ if (collprovider == COLLPROVIDER_ICU)
+ {
result = false;
+ }
+ else
+ {
+ /* COLLPROVIDER_LIBC */
+ char *localeptr = setlocale(LC_COLLATE, NULL);
+
+ if (!localeptr)
+ elog(ERROR, "invalid LC_COLLATE setting");
+
+ result = locale_is_c(localeptr);
+ }
return (bool) result;
}
@@ -1191,20 +1299,28 @@ lc_ctype_is_c(Oid collation)
if (collation == DEFAULT_COLLATION_OID)
{
static int result = -1;
- char *localeptr;
+ char collprovider;
if (result >= 0)
return (bool) result;
- localeptr = setlocale(LC_CTYPE, NULL);
- if (!localeptr)
- elog(ERROR, "invalid LC_CTYPE setting");
-
- if (strcmp(localeptr, "C") == 0)
- result = true;
- else if (strcmp(localeptr, "POSIX") == 0)
- result = true;
- else
+
+ collprovider = get_default_collprovider();
+ Assert(is_valid_nondefault_collprovider(collprovider));
+
+ if (collprovider == COLLPROVIDER_ICU)
+ {
result = false;
+ }
+ else
+ {
+ /* COLLPROVIDER_LIBC */
+ char *localeptr = setlocale(LC_CTYPE, NULL);
+
+ if (!localeptr)
+ elog(ERROR, "invalid LC_CTYPE setting");
+
+ result = locale_is_c(localeptr);
+ }
return (bool) result;
}
@@ -1365,25 +1481,15 @@ pg_newlocale_from_collation(Oid collid)
else if (collform->collprovider == COLLPROVIDER_ICU)
{
#ifdef USE_ICU
- UCollator *collator;
- UErrorCode status;
-
if (strcmp(collcollate, collctype) != 0)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("collations with different collate and ctype values are not supported by ICU")));
- status = U_ZERO_ERROR;
- collator = ucol_open(collcollate, &status);
- if (U_FAILURE(status))
- ereport(ERROR,
- (errmsg("could not open collator for locale \"%s\": %s",
- collcollate, u_errorName(status))));
-
/* We will leak this string if we get an error below :-( */
result.info.icu.locale = MemoryContextStrdup(TopMemoryContext,
collcollate);
- result.info.icu.ucol = collator;
+ result.info.icu.ucol = open_collator(collcollate);
#else /* not USE_ICU */
/* could get here if a collation was created by a build with ICU */
ereport(ERROR,
@@ -1440,46 +1546,6 @@ pg_newlocale_from_collation(Oid collid)
return cache_entry->locale;
}
-/*
- * Get provider-specific collation version string for the given collation from
- * the operating system/library.
- *
- * A particular provider must always either return a non-NULL string or return
- * NULL (if it doesn't support versions). It must not return NULL for some
- * collcollate and not NULL for others.
- */
-char *
-get_collation_actual_version(char collprovider, const char *collcollate)
-{
- char *collversion;
-
-#ifdef USE_ICU
- if (collprovider == COLLPROVIDER_ICU)
- {
- UCollator *collator;
- UErrorCode status;
- UVersionInfo versioninfo;
- char buf[U_MAX_VERSION_STRING_LENGTH];
-
- status = U_ZERO_ERROR;
- collator = ucol_open(collcollate, &status);
- if (U_FAILURE(status))
- ereport(ERROR,
- (errmsg("could not open collator for locale \"%s\": %s",
- collcollate, u_errorName(status))));
- ucol_getVersion(collator, versioninfo);
- ucol_close(collator);
-
- u_versionToString(versioninfo, buf);
- collversion = pstrdup(buf);
- }
- else
-#endif
- collversion = NULL;
-
- return collversion;
-}
-
#ifdef USE_ICU
/*
@@ -1761,3 +1827,125 @@ char2wchar(wchar_t *to, size_t tolen, const char *from, size_t fromlen,
return result;
}
+
+#ifdef USE_ICU
+/*
+ * If locale is "" return the environment value from setlocale().
+ *
+ * Otherwise return a palloc'd copy of locale if it is not NULL.
+ */
+static char *
+check_icu_locale(const char *locale)
+{
+ char *canonname = NULL;
+ char *winlocale = NULL;
+ char *result;
+
+ /* Windows locales can be in the format ".codepage" */
+ if (locale && (strlen(locale) == 0 || locale[0] == '.'))
+ {
+ check_locale(LC_COLLATE, locale, &canonname, COLLPROVIDER_LIBC);
+ locale = (const char *) canonname;
+ }
+
+#ifdef WIN32
+ if (!locale_is_c(locale))
+ {
+ winlocale = check_icu_winlocale(locale);
+ locale = (const char *) winlocale;
+ }
+#endif
+
+ result = locale ? pstrdup(locale) : NULL;
+
+ if (canonname)
+ pfree(canonname);
+ if (winlocale)
+ pfree(winlocale);
+
+ return result;
+}
+
+/*
+ * Get the default icu collation.
+ */
+const char *
+get_icu_default_collate(void)
+{
+ /* Cache the result so we only have to compute it once. */
+ static char result[NAMEDATALEN];
+ static bool cached = false;
+ const char *locale,
+ *collate;
+ char *langtag;
+
+ if (cached)
+ return result;
+
+ locale = uloc_getDefault();
+ if (!locale)
+ ereport(ERROR, (errmsg("ICU error: uloc_getDefault() failed")));
+
+ langtag = get_icu_language_tag(locale);
+ collate = get_icu_collate(locale, langtag);
+
+ if (strlen(collate) >= NAMEDATALEN)
+ ereport(FATAL,
+ (errmsg("the default ICU collation name \"%s\" is too long", collate)));
+
+ strcpy(result, collate);
+ cached = true;
+
+ pfree(langtag);
+ return result;
+}
+
+/*
+ * Get the collator for the default ICU collation.
+ */
+UCollator *
+get_default_collation_collator(void)
+{
+ /* Cache the result so we only have to compute it once. */
+ static UCollator *collator = NULL;
+
+ if (collator)
+ return collator;
+
+ collator = open_collator(get_icu_default_collate());
+ return collator;
+}
+#endif /* USE_ICU */
+
+/*
+ * Get the default collation provider.
+ */
+char
+get_default_collprovider(void)
+{
+ /* Cache the result so we only have to compute it once. */
+ static char result = '\0';
+ HeapTuple tp;
+ Form_pg_database dbform;
+ char *datcollate;
+
+ if (result)
+ return result;
+
+ tp = SearchSysCache1(DATABASEOID, ObjectIdGetDatum(MyDatabaseId));
+ if (!HeapTupleIsValid(tp))
+ elog(ERROR, "cache lookup failed for database %u", MyDatabaseId);
+
+ dbform = (Form_pg_database) GETSTRUCT(tp);
+ datcollate = NameStr(dbform->datcollate);
+ check_locale_collprovider(datcollate, NULL, &result, NULL);
+
+ if (!is_valid_nondefault_collprovider(result))
+ /* This could happen when manually creating a mess in the catalogs. */
+ ereport(FATAL,
+ (errmsg("could not determine the collation provider for datcollate \"%s\" of database \"%s\"",
+ datcollate, NameStr(dbform->datname))));
+
+ ReleaseSysCache(tp);
+ return result;
+}
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index fcc8323..807311a 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -5689,13 +5689,14 @@ find_join_input_rel(PlannerInfo *root, Relids relids)
*/
static int
pattern_char_isalpha(char c, bool is_multibyte,
- pg_locale_t locale, bool locale_is_c)
+ pg_locale_t locale, char collprovider, bool locale_is_c)
{
if (locale_is_c)
return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
else if (is_multibyte && IS_HIGHBIT_SET(c))
return true;
- else if (locale && locale->provider == COLLPROVIDER_ICU)
+ else if (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII)
return IS_HIGHBIT_SET(c) ? true : false;
#ifdef HAVE_LOCALE_T
else if (locale && locale->provider == COLLPROVIDER_LIBC)
@@ -5731,6 +5732,7 @@ like_fixed_prefix(Const *patt_const, bool case_insensitive, Oid collation,
bool is_multibyte = (pg_database_encoding_max_length() > 1);
pg_locale_t locale = 0;
bool locale_is_c = false;
+ char collprovider = COLLPROVIDER_LIBC;
/* the right-hand const is type text or bytea */
Assert(typeid == BYTEAOID || typeid == TEXTOID);
@@ -5759,6 +5761,11 @@ like_fixed_prefix(Const *patt_const, bool case_insensitive, Oid collation,
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
locale = pg_newlocale_from_collation(collation);
+ collprovider = locale->provider;
+ }
+ else
+ {
+ collprovider = get_default_collprovider();
}
}
@@ -5796,7 +5803,8 @@ like_fixed_prefix(Const *patt_const, bool case_insensitive, Oid collation,
/* Stop if case-varying character (it's sort of a wildcard) */
if (case_insensitive &&
- pattern_char_isalpha(patt[pos], is_multibyte, locale, locale_is_c))
+ pattern_char_isalpha(patt[pos], is_multibyte, locale,
+ collprovider, locale_is_c))
break;
match[match_pos++] = patt[pos];
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index 304cb26..e413e8b 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -1402,8 +1402,15 @@ varstr_cmp(const char *arg1, int len1, const char *arg2, int len2, Oid collid)
char *a1p,
*a2p;
pg_locale_t mylocale = 0;
+ char collprovider;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY,
+ use_icu;
- if (collid != DEFAULT_COLLATION_OID)
+ if (collid == DEFAULT_COLLATION_OID)
+ {
+ collprovider = get_default_collprovider();
+ }
+ else
{
if (!OidIsValid(collid))
{
@@ -1417,8 +1424,15 @@ varstr_cmp(const char *arg1, int len1, const char *arg2, int len2, Oid collid)
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
mylocale = pg_newlocale_from_collation(collid);
+ collprovider = mylocale->provider;
}
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
+
/*
* memcmp() can't tell us which of two unequal strings sorts first,
* but it's a cheap way to tell if they're equal. Testing shows that
@@ -1433,8 +1447,7 @@ varstr_cmp(const char *arg1, int len1, const char *arg2, int len2, Oid collid)
#ifdef WIN32
/* Win32 does not have UTF-8, so we need to map to UTF-16 */
- if (GetDatabaseEncoding() == PG_UTF8
- && (!mylocale || mylocale->provider == COLLPROVIDER_LIBC))
+ if (GetDatabaseEncoding() == PG_UTF8 && use_libc)
{
int a1len;
int a2len;
@@ -1536,60 +1549,67 @@ varstr_cmp(const char *arg1, int len1, const char *arg2, int len2, Oid collid)
memcpy(a2p, arg2, len2);
a2p[len2] = '\0';
- if (mylocale)
+ if (use_icu)
{
- if (mylocale->provider == COLLPROVIDER_ICU)
- {
#ifdef USE_ICU
+ UCollator *collator;
+
+ if (mylocale)
+ collator = mylocale->info.icu.ucol;
+ else
+ collator = get_default_collation_collator();
+
#ifdef HAVE_UCOL_STRCOLLUTF8
- if (GetDatabaseEncoding() == PG_UTF8)
- {
- UErrorCode status;
+ if (GetDatabaseEncoding() == PG_UTF8)
+ {
+ UErrorCode status;
- status = U_ZERO_ERROR;
- result = ucol_strcollUTF8(mylocale->info.icu.ucol,
- arg1, len1,
- arg2, len2,
- &status);
- if (U_FAILURE(status))
- ereport(ERROR,
- (errmsg("collation failed: %s", u_errorName(status))));
- }
- else
+ status = U_ZERO_ERROR;
+ result = ucol_strcollUTF8(collator,
+ arg1, len1,
+ arg2, len2,
+ &status);
+ if (U_FAILURE(status))
+ ereport(ERROR,
+ (errmsg("collation failed: %s", u_errorName(status))));
+ }
+ else
#endif
- {
- int32_t ulen1,
- ulen2;
- UChar *uchar1,
- *uchar2;
+ {
+ int32_t ulen1,
+ ulen2;
+ UChar *uchar1,
+ *uchar2;
- ulen1 = icu_to_uchar(&uchar1, arg1, len1);
- ulen2 = icu_to_uchar(&uchar2, arg2, len2);
+ ulen1 = icu_to_uchar(&uchar1, arg1, len1);
+ ulen2 = icu_to_uchar(&uchar2, arg2, len2);
- result = ucol_strcoll(mylocale->info.icu.ucol,
- uchar1, ulen1,
- uchar2, ulen2);
+ result = ucol_strcoll(collator,
+ uchar1, ulen1,
+ uchar2, ulen2);
- pfree(uchar1);
- pfree(uchar2);
- }
+ pfree(uchar1);
+ pfree(uchar2);
+ }
#else /* not USE_ICU */
- /* shouldn't happen */
- elog(ERROR, "unsupported collprovider: %c", mylocale->provider);
+ /* shouldn't happen */
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
#endif /* not USE_ICU */
- }
- else
- {
+ }
+ else
+ {
+ /* use_libc */
+
+ if (mylocale)
#ifdef HAVE_LOCALE_T
result = strcoll_l(a1p, a2p, mylocale->info.lt);
#else
/* shouldn't happen */
- elog(ERROR, "unsupported collprovider: %c", mylocale->provider);
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
#endif
- }
+ else
+ result = strcoll(a1p, a2p);
}
- else
- result = strcoll(a1p, a2p);
/*
* In some locales strcoll() can claim that nonidentical strings are
@@ -1811,6 +1831,9 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
bool collate_c = false;
VarStringSortSupport *sss;
pg_locale_t locale = 0;
+ char collprovider = '\0';
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY = false;
+ bool use_icu = false;
/*
* If possible, set ssup->comparator to a function which can be used to
@@ -1840,7 +1863,11 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
* we'll figure out the collation based on the locale id and cache the
* result.
*/
- if (collid != DEFAULT_COLLATION_OID)
+ if (collid == DEFAULT_COLLATION_OID)
+ {
+ collprovider = get_default_collprovider();
+ }
+ else
{
if (!OidIsValid(collid))
{
@@ -1854,8 +1881,15 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
locale = pg_newlocale_from_collation(collid);
+ collprovider = locale->provider;
}
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
+
/*
* There is a further exception on Windows. When the database
* encoding is UTF-8 and we are not using the C collation, complex
@@ -1865,8 +1899,7 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
* trampoline. ICU locales work just the same on Windows, however.
*/
#ifdef WIN32
- if (GetDatabaseEncoding() == PG_UTF8 &&
- !(locale && locale->provider == COLLPROVIDER_ICU))
+ if (GetDatabaseEncoding() == PG_UTF8 && use_libc)
return;
#endif
@@ -1895,7 +1928,7 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
* platforms.
*/
#ifndef TRUST_STRXFRM
- if (!collate_c && !(locale && locale->provider == COLLPROVIDER_ICU))
+ if (!collate_c && !use_icu)
abbreviate = false;
#endif
@@ -2037,6 +2070,9 @@ varstrfastcmp_locale(Datum x, Datum y, SortSupport ssup)
VarString *arg2 = DatumGetVarStringPP(y);
bool arg1_match;
VarStringSortSupport *sss = (VarStringSortSupport *) ssup->ssup_extra;
+ char collprovider;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY,
+ use_icu;
/* working state */
char *a1p,
@@ -2130,59 +2166,77 @@ varstrfastcmp_locale(Datum x, Datum y, SortSupport ssup)
}
if (sss->locale)
+ collprovider = sss->locale->provider;
+ else
+ collprovider = get_default_collprovider();
+
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
+
+ if (use_icu)
{
- if (sss->locale->provider == COLLPROVIDER_ICU)
- {
#ifdef USE_ICU
-#ifdef HAVE_UCOL_STRCOLLUTF8
- if (GetDatabaseEncoding() == PG_UTF8)
- {
- UErrorCode status;
+ UCollator *collator;
- status = U_ZERO_ERROR;
- result = ucol_strcollUTF8(sss->locale->info.icu.ucol,
- a1p, len1,
- a2p, len2,
- &status);
- if (U_FAILURE(status))
- ereport(ERROR,
- (errmsg("collation failed: %s", u_errorName(status))));
- }
- else
+ if (sss->locale)
+ collator = sss->locale->info.icu.ucol;
+ else
+ collator = get_default_collation_collator();
+
+#ifdef HAVE_UCOL_STRCOLLUTF8
+ if (GetDatabaseEncoding() == PG_UTF8)
+ {
+ UErrorCode status;
+
+ status = U_ZERO_ERROR;
+ result = ucol_strcollUTF8(collator,
+ a1p, len1,
+ a2p, len2,
+ &status);
+ if (U_FAILURE(status))
+ ereport(ERROR,
+ (errmsg("collation failed: %s", u_errorName(status))));
+ }
+ else
#endif
- {
- int32_t ulen1,
- ulen2;
- UChar *uchar1,
- *uchar2;
+ {
+ int32_t ulen1,
+ ulen2;
+ UChar *uchar1,
+ *uchar2;
- ulen1 = icu_to_uchar(&uchar1, a1p, len1);
- ulen2 = icu_to_uchar(&uchar2, a2p, len2);
+ ulen1 = icu_to_uchar(&uchar1, a1p, len1);
+ ulen2 = icu_to_uchar(&uchar2, a2p, len2);
- result = ucol_strcoll(sss->locale->info.icu.ucol,
- uchar1, ulen1,
- uchar2, ulen2);
+ result = ucol_strcoll(collator,
+ uchar1, ulen1,
+ uchar2, ulen2);
- pfree(uchar1);
- pfree(uchar2);
- }
+ pfree(uchar1);
+ pfree(uchar2);
+ }
#else /* not USE_ICU */
- /* shouldn't happen */
- elog(ERROR, "unsupported collprovider: %c", sss->locale->provider);
+ /* shouldn't happen */
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
#endif /* not USE_ICU */
- }
- else
- {
+ }
+ else
+ {
+ /* use_libc */
+
+ if (sss->locale)
#ifdef HAVE_LOCALE_T
result = strcoll_l(sss->buf1, sss->buf2, sss->locale->info.lt);
#else
/* shouldn't happen */
- elog(ERROR, "unsupported collprovider: %c", sss->locale->provider);
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
#endif
- }
+ else
+ result = strcoll(sss->buf1, sss->buf2);
}
- else
- result = strcoll(sss->buf1, sss->buf2);
/*
* In some locales strcoll() can claim that nonidentical strings are
@@ -2287,6 +2341,9 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
else
{
Size bsize;
+ char collprovider;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY,
+ use_icu;
#ifdef USE_ICU
int32_t ulen = -1;
UChar *uchar = NULL;
@@ -2323,10 +2380,20 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
sss->buf1[len] = '\0';
sss->last_len1 = len;
+ if (sss->locale)
+ collprovider = sss->locale->provider;
+ else
+ collprovider = get_default_collprovider();
+
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
+
#ifdef USE_ICU
/* When using ICU and not UTF8, convert string to UChar. */
- if (sss->locale && sss->locale->provider == COLLPROVIDER_ICU &&
- GetDatabaseEncoding() != PG_UTF8)
+ if (use_icu && GetDatabaseEncoding() != PG_UTF8)
ulen = icu_to_uchar(&uchar, sss->buf1, len);
#endif
@@ -2340,9 +2407,15 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
*/
for (;;)
{
-#ifdef USE_ICU
- if (sss->locale && sss->locale->provider == COLLPROVIDER_ICU)
+ if (use_icu)
{
+#ifdef USE_ICU
+ UCollator *collator;
+
+ if (sss->locale)
+ collator = sss->locale->info.icu.ucol;
+ else
+ collator = get_default_collation_collator();
/*
* When using UTF8, use the iteration interface so we only
* need to produce as many bytes as we actually need.
@@ -2356,7 +2429,7 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
uiter_setUTF8(&iter, sss->buf1, len);
state[0] = state[1] = 0; /* won't need that again */
status = U_ZERO_ERROR;
- bsize = ucol_nextSortKeyPart(sss->locale->info.icu.ucol,
+ bsize = ucol_nextSortKeyPart(collator,
&iter,
state,
(uint8_t *) sss->buf2,
@@ -2368,19 +2441,26 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
u_errorName(status))));
}
else
- bsize = ucol_getSortKey(sss->locale->info.icu.ucol,
+ bsize = ucol_getSortKey(collator,
uchar, ulen,
(uint8_t *) sss->buf2, sss->buflen2);
+#else /* not USE_ICU */
+ /* shouldn't happen */
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
+#endif /* not USE_ICU */
}
else
-#endif
+ {
+ /* use_libc */
+
#ifdef HAVE_LOCALE_T
- if (sss->locale && sss->locale->provider == COLLPROVIDER_LIBC)
- bsize = strxfrm_l(sss->buf2, sss->buf1,
- sss->buflen2, sss->locale->info.lt);
- else
+ if (sss->locale)
+ bsize = strxfrm_l(sss->buf2, sss->buf1,
+ sss->buflen2, sss->locale->info.lt);
+ else
#endif
- bsize = strxfrm(sss->buf2, sss->buf1, sss->buflen2);
+ bsize = strxfrm(sss->buf2, sss->buf1, sss->buflen2);
+ }
sss->last_len2 = bsize;
if (bsize < sss->buflen2)
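Reviewer note on the varlena.c hunks above: the same use_icu/use_libc decision is recomputed in several functions. A minimal standalone sketch of that decision follows. The names SketchEncoding, ProviderChoice and choose_provider are hypothetical, and the 'i'/'c' provider codes are taken from the pg_dump hunk of this patch. This is an illustration, not code from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the catalog constants and encoding IDs. */
#define COLLPROVIDER_ICU  'i'
#define COLLPROVIDER_LIBC 'c'

typedef enum
{
    ENC_SQL_ASCII,
    ENC_UTF8,
    ENC_OTHER
} SketchEncoding;

typedef struct
{
    bool use_icu;
    bool use_libc;
} ProviderChoice;

/*
 * Mirror the decision made in the hunks above: ICU is used only when the
 * collation provider is ICU and the encoding is not SQL_ASCII; otherwise
 * the libc path is taken.
 */
static ProviderChoice
choose_provider(char collprovider, SketchEncoding enc)
{
    ProviderChoice choice;

    choice.use_icu = (collprovider == COLLPROVIDER_ICU &&
                      enc != ENC_SQL_ASCII);
    choice.use_libc = (collprovider == COLLPROVIDER_LIBC ||
                       enc == ENC_SQL_ASCII);

    /* Same invariant as the Assert() in the patch. */
    assert(choice.use_libc || choice.use_icu);
    return choice;
}
```

With this shape, an ICU collation in a SQL_ASCII database falls back to libc, and any provider code other than 'i' or 'c' trips the assertion unless the encoding is SQL_ASCII.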
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 4846289..9f588ba 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -29,9 +29,11 @@
#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/pg_authid.h"
+#include "catalog/pg_collation.h"
#include "catalog/pg_database.h"
#include "catalog/pg_db_role_setting.h"
#include "catalog/pg_tablespace.h"
+#include "common/pg_collation_fn_common.h"
#include "libpq/auth.h"
#include "libpq/libpq-be.h"
#include "mb/pg_wchar.h"
@@ -296,6 +298,13 @@ CheckMyDatabase(const char *name, bool am_superuser)
Form_pg_database dbform;
char *collate;
char *ctype;
+ char *datcollate;
+ char collprovider;
+ char *collversion;
+ char *wincollate = NULL;
+ char *langtag = NULL;
+ const char *collcollate;
+ char *actual_versionstr;
/* Fetch our pg_database row normally, via syscache */
tup = SearchSysCache1(DATABASEOID, ObjectIdGetDatum(MyDatabaseId));
@@ -377,27 +386,124 @@ CheckMyDatabase(const char *name, bool am_superuser)
PGC_BACKEND, PGC_S_DYNAMIC_DEFAULT);
/* assign locale variables */
- collate = NameStr(dbform->datcollate);
ctype = NameStr(dbform->datctype);
+ datcollate = NameStr(dbform->datcollate);
+ check_locale_collprovider(datcollate, &collate, &collprovider,
+ &collversion);
- if (pg_perm_setlocale(LC_COLLATE, collate) == NULL)
+ if (!is_valid_nondefault_collprovider(collprovider))
+ /* This could happen when manually creating a mess in the catalogs. */
+ ereport(FATAL,
+ (errmsg("could not find out the collation provider for datcollate \"%s\" of database \"%s\"",
+ datcollate, name)));
+
+#ifndef USE_ICU
+ if (collprovider == COLLPROVIDER_ICU)
+ ereport(FATAL,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("ICU is not supported in this build"),
+ errhint("Recreate the database with libc locale or rebuild PostgreSQL using --with-icu.")));
+#endif
+
+ /* we always check lc_collate for libc */
+ if (pg_perm_setlocale(LC_COLLATE, collate, COLLPROVIDER_LIBC) == NULL)
ereport(FATAL,
(errmsg("database locale is incompatible with operating system"),
- errdetail("The database was initialized with LC_COLLATE \"%s\", "
- " which is not recognized by setlocale().", collate),
+ errdetail("The database was initialized with LC_COLLATE \"%s\" (provider \"%s\"), "
+ " which is not recognized by setlocale().",
+ collate, get_collprovider_name(COLLPROVIDER_LIBC)),
errhint("Recreate the database with another locale or install the missing locale.")));
- if (pg_perm_setlocale(LC_CTYPE, ctype) == NULL)
+ /* check lc_collate and lc_ctype for icu if we need it */
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+ if (pg_perm_setlocale(LC_COLLATE, collate, collprovider) == NULL)
+ ereport(FATAL,
+ (errmsg("database locale is incompatible with operating system"),
+ errdetail("The database was initialized with LC_COLLATE \"%s\" (provider \"%s\"), "
+ " which is not recognized by uloc_setDefault().",
+ collate, get_collprovider_name(collprovider)),
+ errhint("Recreate the database with another locale or install the missing locale.")));
+
+ /* This could happen when manually creating a mess in the catalogs. */
+ if (strcmp(collate, ctype) != 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("collations with different collate and ctype values are not supported by ICU")));
+ }
+
+ if (pg_perm_setlocale(LC_CTYPE, ctype, '\0') == NULL)
ereport(FATAL,
(errmsg("database locale is incompatible with operating system"),
errdetail("The database was initialized with LC_CTYPE \"%s\", "
" which is not recognized by setlocale().", ctype),
errhint("Recreate the database with another locale or install the missing locale.")));
+ /* get the actual version of the collation */
+
+#ifdef USE_ICU
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+ collcollate = (const char *) collate;
+#ifdef WIN32
+ if (!locale_is_c(collcollate))
+ {
+ wincollate = check_icu_winlocale(collcollate);
+ collcollate = (const char *) wincollate;
+ }
+#endif /* WIN32 */
+ langtag = get_icu_language_tag(collcollate);
+ collcollate = get_icu_collate(collcollate, langtag);
+ }
+ else
+#endif /* USE_ICU */
+ {
+ /* COLLPROVIDER_LIBC */
+ collcollate = (const char *) collate;
+ }
+
+ actual_versionstr = get_collation_actual_version(collprovider, collcollate);
+
+ /*
+ * Check the collation version (this matches the version checking in the
+ * function pg_newlocale_from_collation())
+ */
+ if (collversion)
+ {
+ if (!actual_versionstr)
+ {
+ /*
+ * This could happen when manually creating a mess in the catalogs.
+ */
+ ereport(ERROR,
+ (errmsg("collation \"%s\" (provider \"%s\") has no actual version, but a version was specified",
+ collate, get_collprovider_name(collprovider))));
+ }
+
+ if (strcmp(actual_versionstr, collversion) != 0)
+ ereport(ERROR,
+ (errmsg("collation \"%s\" (provider \"%s\") has version mismatch",
+ collate, get_collprovider_name(collprovider)),
+ errdetail("The collation in the database was created using version %s, "
+ "but the operating system provides version %s.",
+ collversion, actual_versionstr),
+ errhint("Build PostgreSQL with the right library version.")));
+ }
+
/* Make the locale settings visible as GUC variables, too */
- SetConfigOption("lc_collate", collate, PGC_INTERNAL, PGC_S_OVERRIDE);
+ SetConfigOption("lc_collate", datcollate, PGC_INTERNAL, PGC_S_OVERRIDE);
SetConfigOption("lc_ctype", ctype, PGC_INTERNAL, PGC_S_OVERRIDE);
+ pfree(collate);
+ if (collversion)
+ pfree(collversion);
+ if (langtag)
+ pfree(langtag);
+ if (actual_versionstr)
+ pfree(actual_versionstr);
+ if (wincollate)
+ pfree(wincollate);
+
check_strxfrm_bug();
ReleaseSysCache(tup);
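Reviewer note on the version check added to CheckMyDatabase() above: the comparison has three outcomes, which the following standalone sketch spells out. The VersionCheck enum and check_collation_version() are hypothetical names used only for illustration; the patch reports these cases through ereport() instead of returning a code.

```c
#include <stddef.h>
#include <string.h>

typedef enum
{
    VERSION_OK,         /* nothing recorded, or versions match */
    VERSION_MISSING,    /* a version was recorded, but the provider has none */
    VERSION_MISMATCH    /* recorded and actual versions differ */
} VersionCheck;

/*
 * Compare the version recorded with the database's collation against the
 * version the collation provider reports now.  Either pointer may be NULL.
 */
static VersionCheck
check_collation_version(const char *recorded, const char *actual)
{
    if (recorded == NULL)
        return VERSION_OK;      /* no recorded version: nothing to check */
    if (actual == NULL)
        return VERSION_MISSING;
    if (strcmp(recorded, actual) != 0)
        return VERSION_MISMATCH;
    return VERSION_OK;
}
```

Note that a collation with no recorded version is accepted without comparison, which matches the `if (collversion)` guard in the hunk.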
diff --git a/src/backend/utils/mb/encnames.c b/src/backend/utils/mb/encnames.c
index 12b61cd..1e75257 100644
--- a/src/backend/utils/mb/encnames.c
+++ b/src/backend/utils/mb/encnames.c
@@ -403,8 +403,6 @@ const pg_enc2gettext pg_enc2gettext_tbl[] =
};
-#ifndef FRONTEND
-
/*
* Table of encoding names for ICU
*
@@ -457,6 +455,7 @@ is_encoding_supported_by_icu(int encoding)
return (pg_enc2icu_tbl[encoding] != NULL);
}
+#ifndef FRONTEND
const char *
get_encoding_name_for_icu(int encoding)
{
@@ -475,7 +474,6 @@ get_encoding_name_for_icu(int encoding)
return icu_encoding_name;
}
-
#endif /* not FRONTEND */
diff --git a/src/bin/initdb/Makefile b/src/bin/initdb/Makefile
index dae3daf..27415b8 100644
--- a/src/bin/initdb/Makefile
+++ b/src/bin/initdb/Makefile
@@ -19,7 +19,7 @@ include $(top_builddir)/src/Makefile.global
override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) -I$(top_srcdir)/src/timezone $(CPPFLAGS)
# note: we need libpq only because fe_utils does
-override LDFLAGS := -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport) $(LDFLAGS)
+override LDFLAGS := -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport) $(ICU_LIBS) $(LDFLAGS)
# use system timezone data?
ifneq (,$(with_system_tzdata))
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 2efd3b7..2d3b90d 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -55,6 +55,10 @@
#include <signal.h>
#include <time.h>
+#ifdef USE_ICU
+#include <unicode/uloc.h>
+#endif
+
#ifdef HAVE_SHM_OPEN
#include "sys/mman.h"
#endif
@@ -65,6 +69,7 @@
#include "catalog/pg_class.h"
#include "catalog/pg_collation.h"
#include "common/file_utils.h"
+#include "common/pg_collation_fn_common.h"
#include "common/restricted_token.h"
#include "common/username.h"
#include "fe_utils/string_utils.h"
@@ -144,6 +149,8 @@ static bool data_checksums = false;
static char *xlog_dir = NULL;
static char *str_wal_segment_size_mb = NULL;
static int wal_segment_size_mb;
+static char collprovider = '\0';
+static char *collversion = NULL;
/* internal vars */
@@ -267,10 +274,15 @@ static void check_ok(void);
static char *escape_quotes(const char *src);
static int locale_date_order(const char *locale);
static void check_locale_name(int category, const char *locale,
- char **canonname);
-static bool check_locale_encoding(const char *locale, int encoding);
+ char **canonname, char collprovider);
+static bool check_locale_encoding(const char *locale, int encoding,
+ char collprovider);
static void setlocales(void);
static void usage(const char *progname);
+#ifdef USE_ICU
+static char *check_icu_locale_name(const char *locale);
+#endif
+static void set_collation_version(void);
void setup_pgdata(void);
void setup_bin_paths(const char *argv0);
void setup_data_file_paths(void);
@@ -1317,10 +1329,27 @@ bootstrap_template1(void)
char **bki_lines;
char headerline[MAXPGPATH];
char buf[64];
+ char *lc_collate_full_name;
printf(_("running bootstrap script ... "));
fflush(stdout);
+ Assert(lc_collate);
+
+ lc_collate_full_name = get_full_collation_name(lc_collate, collprovider,
+ collversion);
+
+ if (!lc_collate_full_name)
+ exit(1); /* get_full_collation_name printed the error */
+
+ if (strlen(lc_collate_full_name) >= NAMEDATALEN)
+ {
+ fprintf(stderr,
+ _("%s: the full collation name \"%s\" is too long\n"),
+ progname, lc_collate_full_name);
+ exit(1);
+ }
+
bki_lines = readfile(bki_file);
/* Check that bki file appears to be of the right version */
@@ -1359,7 +1388,8 @@ bootstrap_template1(void)
bki_lines = replace_token(bki_lines, "ENCODING", encodingid_to_string(encodingid));
- bki_lines = replace_token(bki_lines, "LC_COLLATE", escape_quotes(lc_collate));
+ bki_lines = replace_token(bki_lines, "LC_COLLATE",
+ escape_quotes(lc_collate_full_name));
bki_lines = replace_token(bki_lines, "LC_CTYPE", escape_quotes(lc_ctype));
@@ -1400,6 +1430,7 @@ bootstrap_template1(void)
PG_CMD_CLOSE;
free(bki_lines);
+ free(lc_collate_full_name);
check_ok();
}
@@ -2143,53 +2174,143 @@ locale_date_order(const char *locale)
* the locale name, but typically it doesn't.)
*
* this should match the backend's check_locale() function
+ *
+ * Set collprovider to '\0' if category is not LC_COLLATE.
*/
static void
-check_locale_name(int category, const char *locale, char **canonname)
+check_locale_name(int category, const char *locale, char **canonname,
+ char collprovider)
{
- char *save;
- char *res;
+ const char *save;
+ const char *res;
+ char *save_dup;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY =
+ category != LC_COLLATE || collprovider == COLLPROVIDER_LIBC;
+ bool use_icu =
+ category == LC_COLLATE && collprovider == COLLPROVIDER_ICU;
+ bool failure = false;
+#ifdef USE_ICU
+ UErrorCode status;
+ char *icu_locale;
+#endif
- if (canonname)
- *canonname = NULL; /* in case of failure */
+ Assert(use_libc || use_icu);
- save = setlocale(category, NULL);
- if (!save)
+#ifndef USE_ICU
+ if (use_icu)
{
- fprintf(stderr, _("%s: setlocale() failed\n"),
+ fprintf(stderr,
+ _("%s: ICU is not supported in this build\n"
+ "You need to rebuild PostgreSQL using --with-icu.\n"),
progname);
exit(1);
}
+#endif
+
+ if (canonname)
+ *canonname = NULL; /* in case of failure */
+
+#ifdef USE_ICU
+ if (use_icu)
+ {
+ save = uloc_getDefault();
+ if (!save)
+ {
+ fprintf(stderr, _("%s: ICU error: uloc_getDefault() failed\n"),
+ progname);
+ exit(1);
+ }
+ }
+ else
+#endif
+ {
+ /* use_libc */
+ save = setlocale(category, NULL);
+ if (!save)
+ {
+ fprintf(stderr, _("%s: setlocale() failed\n"),
+ progname);
+ exit(1);
+ }
+ }
/* save may be pointing at a modifiable scratch variable, so copy it. */
- save = pg_strdup(save);
+ save_dup = pg_strdup(save);
/* for setlocale() call */
if (!locale)
locale = "";
/* set the locale with setlocale, to see if it accepts it. */
- res = setlocale(category, locale);
+#ifdef USE_ICU
+ if (use_icu)
+ {
+ icu_locale = check_icu_locale_name(locale);
+ if (icu_locale == NULL && locale != NULL)
+ {
+ failure = true;
+ res = NULL;
+ }
+ else
+ {
+ status = U_ZERO_ERROR;
+ uloc_setDefault(icu_locale, &status);
+ res = uloc_getDefault();
+ failure = (U_FAILURE(status) || res == NULL);
+ if (icu_locale)
+ pfree(icu_locale);
+ }
+ }
+ else
+#endif
+ {
+ /* use_libc */
+ res = setlocale(category, locale);
+ failure = (res == NULL);
+ }
/* save canonical name if requested. */
if (res && canonname)
*canonname = pg_strdup(res);
/* restore old value. */
- if (!setlocale(category, save))
+#ifdef USE_ICU
+ if (use_icu)
{
- fprintf(stderr, _("%s: failed to restore old locale \"%s\"\n"),
- progname, save);
- exit(1);
+ status = U_ZERO_ERROR;
+ uloc_setDefault(save_dup, &status);
+ if (U_FAILURE(status))
+ {
+ fprintf(stderr, _("%s: ICU error: failed to restore old locale \"%s\"\n"),
+ progname, save_dup);
+ exit(1);
+ }
}
- free(save);
+ else
+#endif
+ {
+ /* use_libc */
+ if (!setlocale(category, save_dup))
+ {
+ fprintf(stderr, _("%s: failed to restore old locale \"%s\"\n"),
+ progname, save_dup);
+ exit(1);
+ }
+ }
+ free(save_dup);
/* complain if locale wasn't valid */
- if (res == NULL)
+ if (failure)
{
if (*locale)
- fprintf(stderr, _("%s: invalid locale name \"%s\"\n"),
- progname, locale);
+ {
+ if (category == LC_COLLATE)
+ fprintf(stderr, _("%s: invalid locale name \"%s\" (provider \"%s\")\n"),
+ progname, locale, get_collprovider_name(collprovider));
+ else
+ fprintf(stderr, _("%s: invalid locale name \"%s\"\n"),
+ progname, locale);
+ }
else
{
/*
@@ -2211,9 +2332,11 @@ check_locale_name(int category, const char *locale, char **canonname)
* check if the chosen encoding matches the encoding required by the locale
*
* this should match the similar check in the backend createdb() function
+ *
+ * Set collprovider to '\0' if category is not LC_COLLATE.
*/
static bool
-check_locale_encoding(const char *locale, int user_enc)
+check_locale_encoding(const char *locale, int user_enc, char collprovider)
{
int locale_enc;
@@ -2240,6 +2363,25 @@ check_locale_encoding(const char *locale, int user_enc)
progname);
return false;
}
+
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+#ifdef USE_ICU
+ if (!is_encoding_supported_by_icu(user_enc))
+ {
+ fprintf(stderr, _("%s: selected encoding (%s) is not supported for ICU locales\n"),
+ progname, pg_encoding_to_char(user_enc));
+ return false;
+ }
+#else /* not USE_ICU */
+ fprintf(stderr,
+ _("%s: ICU is not supported in this build\n"
+ "You need to rebuild PostgreSQL using --with-icu.\n"),
+ progname);
+ exit(1);
+#endif /* not USE_ICU */
+ }
+
return true;
}
@@ -2251,16 +2393,22 @@ check_locale_encoding(const char *locale, int user_enc)
static void
setlocales(void)
{
- char *canonname;
-
- /* set empty lc_* values to locale config if set */
+ char *canonname = NULL;
if (locale)
{
+ /*
+ * Set up the collation provider if possible and canonicalize the locale
+ * name.
+ */
+ check_locale_collprovider(locale, &canonname, &collprovider, NULL);
+ if (!canonname)
+ exit(1); /* check_locale_collprovider printed the error */
+ locale = canonname;
+
+ /* set empty lc_* values to locale config if set */
if (!lc_ctype)
lc_ctype = locale;
- if (!lc_collate)
- lc_collate = locale;
if (!lc_numeric)
lc_numeric = locale;
if (!lc_time)
@@ -2271,29 +2419,83 @@ setlocales(void)
lc_messages = locale;
}
+ if (lc_collate)
+ {
+ /*
+ * Set up the collation provider if possible and canonicalize the locale
+ * name.
+ */
+ check_locale_collprovider(lc_collate, &canonname, &collprovider, NULL);
+ if (!canonname)
+ exit(1); /* check_locale_collprovider printed the error */
+ lc_collate = canonname;
+ }
+ else if (canonname)
+ {
+ /* we have already canonicalized the locale name */
+ lc_collate = pstrdup(canonname);
+ }
+
/*
* canonicalize locale names, and obtain any missing values from our
* current environment
*/
- check_locale_name(LC_CTYPE, lc_ctype, &canonname);
+ check_locale_name(LC_CTYPE, lc_ctype, &canonname, '\0');
lc_ctype = canonname;
- check_locale_name(LC_COLLATE, lc_collate, &canonname);
+
+ /* we always check lc_collate for libc */
+ check_locale_name(LC_COLLATE, lc_collate, &canonname, COLLPROVIDER_LIBC);
+ if (lc_collate)
+ pfree(lc_collate);
lc_collate = canonname;
- check_locale_name(LC_NUMERIC, lc_numeric, &canonname);
+
+ /* determine the collation provider if we haven't already done it */
+ if (!is_valid_nondefault_collprovider(collprovider))
+ {
+#ifdef USE_ICU
+ if (!locale_is_c(lc_collate))
+ {
+ collprovider = COLLPROVIDER_ICU;
+ }
+ else
+#endif
+ {
+ collprovider = COLLPROVIDER_LIBC;
+ }
+ }
+
+ Assert(is_valid_nondefault_collprovider(collprovider));
+
+ /* check lc_collate and lc_ctype for icu if we need it */
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+ check_locale_name(LC_COLLATE, lc_collate, NULL, collprovider);
+ if (strcmp(lc_collate, lc_ctype) != 0)
+ {
+ fprintf(stderr,
+ _("%s: collations with different collate and ctype values are not supported by ICU\n"),
+ progname);
+ exit(1);
+ }
+ }
+
+ check_locale_name(LC_NUMERIC, lc_numeric, &canonname, '\0');
lc_numeric = canonname;
- check_locale_name(LC_TIME, lc_time, &canonname);
+ check_locale_name(LC_TIME, lc_time, &canonname, '\0');
lc_time = canonname;
- check_locale_name(LC_MONETARY, lc_monetary, &canonname);
+ check_locale_name(LC_MONETARY, lc_monetary, &canonname, '\0');
lc_monetary = canonname;
#if defined(LC_MESSAGES) && !defined(WIN32)
- check_locale_name(LC_MESSAGES, lc_messages, &canonname);
+ check_locale_name(LC_MESSAGES, lc_messages, &canonname, '\0');
lc_messages = canonname;
#else
/* when LC_MESSAGES is not available, use the LC_CTYPE setting */
- check_locale_name(LC_CTYPE, lc_messages, &canonname);
+ check_locale_name(LC_CTYPE, lc_messages, &canonname, '\0');
lc_messages = canonname;
#endif
+
+ set_collation_version();
}
/*
@@ -2510,6 +2712,9 @@ setup_locale_encoding(void)
lc_time);
}
+ printf(_("The default collation provider is \"%s\".\n"),
+ get_collprovider_name(collprovider));
+
if (!encoding)
{
int ctype_enc;
@@ -2560,8 +2765,8 @@ setup_locale_encoding(void)
else
encodingid = get_encoding_id(encoding);
- if (!check_locale_encoding(lc_ctype, encodingid) ||
- !check_locale_encoding(lc_collate, encodingid))
+ if (!check_locale_encoding(lc_ctype, encodingid, '\0') ||
+ !check_locale_encoding(lc_collate, encodingid, collprovider))
exit(1); /* check_locale_encoding printed the error */
}
@@ -3321,3 +3526,113 @@ main(int argc, char *argv[])
return 0;
}
+
+#ifdef USE_ICU
+/*
+ * If locale is "" return the environment value from setlocale().
+ *
+ * Otherwise return a malloc'd copy of locale if it is not NULL.
+ *
+ * This should match the backend's check_icu_locale() function.
+ */
+static char *
+check_icu_locale_name(const char *locale)
+{
+ char *canonname = NULL;
+ char *winlocale = NULL;
+ char *result;
+
+ /* Windows locales can be in the format ".codepage" */
+ if (locale && (strlen(locale) == 0 || locale[0] == '.'))
+ {
+ check_locale_name(LC_COLLATE, locale, &canonname, COLLPROVIDER_LIBC);
+ locale = (const char *) canonname;
+ }
+
+#ifdef WIN32
+ if (!locale_is_c(locale))
+ {
+ winlocale = check_icu_winlocale(locale);
+
+ if (winlocale == NULL && locale != NULL)
+ exit(1); /* check_icu_winlocale printed the error */
+ else
+ locale = winlocale;
+ }
+#endif
+
+ result = locale ? pstrdup(locale) : NULL;
+
+ if (canonname)
+ pfree(canonname);
+ if (winlocale)
+ pfree(winlocale);
+
+ return result;
+}
+#endif /* USE_ICU */
+
+/*
+ * Set up the lc_collate version (get it from the collation provider).
+ */
+static void
+set_collation_version(void)
+{
+ char *wincollate = NULL;
+ char *langtag = NULL;
+ const char *collate;
+ bool failure;
+
+ Assert(lc_collate);
+ Assert(is_valid_nondefault_collprovider(collprovider));
+
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+#ifdef USE_ICU
+ collate = (const char *) lc_collate;
+
+#ifdef WIN32
+ if (!locale_is_c(collate))
+ {
+ wincollate = check_icu_winlocale(collate);
+
+ if (wincollate == NULL && collate != NULL)
+ exit(1); /* check_icu_winlocale printed the error */
+ else
+ collate = (const char *) wincollate;
+ }
+#endif /* WIN32 */
+
+ langtag = get_icu_language_tag(collate);
+ if (!langtag)
+ {
+ /* get_icu_language_tag printed the main error message */
+ fprintf(stderr, _("Rerun %s with a different locale selection.\n"),
+ progname);
+ exit(1);
+ }
+ collate = get_icu_collate(collate, langtag);
+#else /* not USE_ICU */
+ fprintf(stderr,
+ _("%s: ICU is not supported in this build\n"
+ "You need to rebuild PostgreSQL using --with-icu.\n"),
+ progname);
+ exit(1);
+#endif /* not USE_ICU */
+ }
+ else
+ {
+ /* COLLPROVIDER_LIBC */
+ collate = (const char *) lc_collate;
+ }
+
+ get_collation_actual_version(collprovider, collate, &collversion, &failure);
+ if (failure)
+ /* get_collation_actual_version printed the error */
+ exit(1);
+
+ if (langtag)
+ free(langtag);
+ if (wincollate)
+ free(wincollate);
+}
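Reviewer note on check_locale_name() in the initdb.c hunks above: the libc branch follows a save, probe, restore pattern around setlocale(). A minimal standalone sketch of that pattern, using only standard C, is given below. The names dup_string and probe_locale are hypothetical, and unlike the patch this sketch returns false instead of calling exit(1).

```c
#include <locale.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate a string with malloc (NULL on failure). */
static char *
dup_string(const char *s)
{
    char *copy = malloc(strlen(s) + 1);

    if (copy)
        strcpy(copy, s);
    return copy;
}

/*
 * Check whether setlocale() accepts "name" for "category" without leaving
 * the process locale changed.  On success, *canonname (if requested)
 * receives a malloc'd copy of the name setlocale() reported.
 */
static bool
probe_locale(int category, const char *name, char **canonname)
{
    const char *current;
    char       *saved;
    const char *res;
    bool        ok;

    if (canonname)
        *canonname = NULL;      /* in case of failure */

    current = setlocale(category, NULL);
    if (!current)
        return false;

    /* The returned string may be overwritten by later calls, so copy it. */
    saved = dup_string(current);
    if (!saved)
        return false;

    res = setlocale(category, name ? name : "");
    ok = (res != NULL);
    if (ok && canonname)
        *canonname = dup_string(res);

    /* Restore the previous setting whether or not the probe succeeded. */
    if (!setlocale(category, saved))
    {
        ok = false;
        if (canonname)
        {
            free(*canonname);
            *canonname = NULL;
        }
    }
    free(saved);

    return ok;
}
```

The ICU branch in the patch has the same structure, with uloc_getDefault() and uloc_setDefault() in place of the two setlocale() calls.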
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 8ca83c0..ca3b138 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -47,12 +47,14 @@
#include "catalog/pg_attribute.h"
#include "catalog/pg_cast.h"
#include "catalog/pg_class.h"
+#include "catalog/pg_collation.h"
#include "catalog/pg_default_acl.h"
#include "catalog/pg_largeobject.h"
#include "catalog/pg_largeobject_metadata.h"
#include "catalog/pg_proc.h"
#include "catalog/pg_trigger.h"
#include "catalog/pg_type.h"
+#include "common/pg_collation_fn_common.h"
#include "libpq/libpq-fs.h"
#include "dumputils.h"
@@ -13420,9 +13422,10 @@ dumpCollation(Archive *fout, CollInfo *collinfo)
int i_collprovider;
int i_collcollate;
int i_collctype;
- const char *collprovider;
+ const char *collproviderstr;
const char *collcollate;
const char *collctype;
+ const char *collprovider_name;
/* Skip if not to be dumped */
if (!collinfo->dobj.dump || dopt->dataOnly)
@@ -13462,11 +13465,21 @@ dumpCollation(Archive *fout, CollInfo *collinfo)
i_collcollate = PQfnumber(res, "collcollate");
i_collctype = PQfnumber(res, "collctype");
- collprovider = PQgetvalue(res, 0, i_collprovider);
+ collproviderstr = PQgetvalue(res, 0, i_collprovider);
collcollate = PQgetvalue(res, 0, i_collcollate);
collctype = PQgetvalue(res, 0, i_collctype);
/*
+ * Use COLLPROVIDER_DEFAULT to allow dumping pg_catalog; not accepted on
+ * input
+ */
+ collprovider_name = get_collprovider_name(collproviderstr[0]);
+ if (!collprovider_name)
+ exit_horribly(NULL,
+ "unrecognized collation provider: %s\n",
+ collproviderstr);
+
+ /*
* DROP must be fully qualified in case same name appears in pg_catalog
*/
appendPQExpBuffer(delq, "DROP COLLATION %s",
@@ -13477,18 +13490,7 @@ dumpCollation(Archive *fout, CollInfo *collinfo)
appendPQExpBuffer(q, "CREATE COLLATION %s (",
fmtId(collinfo->dobj.name));
- appendPQExpBufferStr(q, "provider = ");
- if (collprovider[0] == 'c')
- appendPQExpBufferStr(q, "libc");
- else if (collprovider[0] == 'i')
- appendPQExpBufferStr(q, "icu");
- else if (collprovider[0] == 'd')
- /* to allow dumping pg_catalog; not accepted on input */
- appendPQExpBufferStr(q, "default");
- else
- exit_horribly(NULL,
- "unrecognized collation provider: %s\n",
- collprovider);
+ appendPQExpBuffer(q, "provider = %s", collprovider_name);
if (strcmp(collcollate, collctype) == 0)
{
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 466a780..1581bf8 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -16,7 +16,9 @@
#include "catalog/pg_attribute.h"
#include "catalog/pg_class.h"
+#include "catalog/pg_collation.h"
#include "catalog/pg_default_acl.h"
+#include "common/pg_collation_fn_common.h"
#include "fe_utils/string_utils.h"
#include "common.h"
@@ -3967,7 +3969,13 @@ listCollations(const char *pattern, bool verbose, bool showSystem)
if (pset.sversion >= 100000)
appendPQExpBuffer(&buf,
- ",\n CASE c.collprovider WHEN 'd' THEN 'default' WHEN 'c' THEN 'libc' WHEN 'i' THEN 'icu' END AS \"%s\"",
+ ",\n CASE c.collprovider WHEN '%c' THEN '%s' WHEN '%c' THEN '%s' WHEN '%c' THEN '%s' END AS \"%s\"",
+ COLLPROVIDER_DEFAULT,
+ get_collprovider_name(COLLPROVIDER_DEFAULT),
+ COLLPROVIDER_LIBC,
+ get_collprovider_name(COLLPROVIDER_LIBC),
+ COLLPROVIDER_ICU,
+ get_collprovider_name(COLLPROVIDER_ICU),
gettext_noop("Provider"));
if (verbose)
diff --git a/src/bin/scripts/Makefile b/src/bin/scripts/Makefile
index 0cc528e..35c7ff9 100644
--- a/src/bin/scripts/Makefile
+++ b/src/bin/scripts/Makefile
@@ -19,7 +19,7 @@ include $(top_builddir)/src/Makefile.global
PROGRAMS = createdb createuser dropdb dropuser clusterdb vacuumdb reindexdb pg_isready
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
-override LDFLAGS := -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport) $(LDFLAGS)
+override LDFLAGS := -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport) $(ICU_LIBS) $(LDFLAGS)
all: $(PROGRAMS)
diff --git a/src/bin/scripts/createdb.c b/src/bin/scripts/createdb.c
index 81a8192..8d89d2b 100644
--- a/src/bin/scripts/createdb.c
+++ b/src/bin/scripts/createdb.c
@@ -58,6 +58,7 @@ main(int argc, char *argv[])
char *lc_collate = NULL;
char *lc_ctype = NULL;
char *locale = NULL;
+ char *canonname = NULL;
PQExpBufferData sql;
@@ -153,7 +154,15 @@ main(int argc, char *argv[])
progname);
exit(1);
}
- lc_ctype = locale;
+
+ /*
+ * remove the collation provider modifier from the locale for lc_ctype
+ */
+ check_locale_collprovider(locale, &canonname, NULL, NULL);
+ if (!canonname)
+ exit(1); /* check_locale_collprovider printed the error */
+ lc_ctype = canonname;
+
lc_collate = locale;
}
@@ -241,6 +250,9 @@ main(int argc, char *argv[])
PQfinish(conn);
+ if (canonname)
+ pfree(canonname);
+
exit(0);
}
diff --git a/src/common/Makefile b/src/common/Makefile
index 80e78d7..4fbe0f0 100644
--- a/src/common/Makefile
+++ b/src/common/Makefile
@@ -43,7 +43,7 @@ override CPPFLAGS += -DVAL_LIBS="\"$(LIBS)\""
OBJS_COMMON = base64.o config_info.o controldata_utils.o exec.o ip.o \
keywords.o md5.o pg_lzcompress.o pgfnames.o psprintf.o relpath.o \
rmtree.o saslprep.o scram-common.o string.o unicode_norm.o \
- username.o wait_error.o
+ username.o wait_error.o pg_collation_fn_common.o
ifeq ($(with_openssl),yes)
OBJS_COMMON += sha2_openssl.o
diff --git a/src/common/pg_collation_fn_common.c b/src/common/pg_collation_fn_common.c
new file mode 100644
index 0000000..a3ba3a3
--- /dev/null
+++ b/src/common/pg_collation_fn_common.c
@@ -0,0 +1,90 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_collation_fn_common.c
+ * common routines to support manipulation of the pg_collation relation
+ *
+ * Portions Copyright (c) 1996-2017, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/common/pg_collation_fn_common.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifdef FRONTEND
+#include "postgres_fe.h"
+#else
+#include "postgres.h"
+#endif
+
+#include "catalog/pg_collation.h"
+#include "common/pg_collation_fn_common.h"
+
+
+/*
+ * Note that we search the table with pg_strcasecmp(), so variant
+ * capitalizations don't need their own entries.
+ */
+typedef struct collprovider_name
+{
+ char collprovider;
+ const char *name;
+} collprovider_name;
+
+static const collprovider_name collprovider_name_tbl[] =
+{
+ {COLLPROVIDER_DEFAULT, "default"},
+ {COLLPROVIDER_LIBC, "libc"},
+ {COLLPROVIDER_ICU, "icu"},
+ {'\0', NULL} /* end marker */
+};
+
+/*
+ * Get the collation provider from the given collation provider name.
+ *
+ * Return '\0' if we can't determine it.
+ */
+char
+get_collprovider(const char *name)
+{
+ int i;
+
+ if (!name)
+ return '\0';
+
+ /* Check the table */
+ for (i = 0; collprovider_name_tbl[i].name; ++i)
+ if (pg_strcasecmp(name, collprovider_name_tbl[i].name) == 0)
+ return collprovider_name_tbl[i].collprovider;
+
+ return '\0';
+}
+
+/*
+ * Get the name of the given collation provider.
+ *
+ * Return NULL if we can't determine it.
+ */
+const char *
+get_collprovider_name(char collprovider)
+{
+ int i;
+
+ /* Check the table */
+ for (i = 0; collprovider_name_tbl[i].collprovider; ++i)
+ if (collprovider_name_tbl[i].collprovider == collprovider)
+ return collprovider_name_tbl[i].name;
+
+ return NULL;
+}
+
+/*
+ * Return true if collation provider is nondefault and valid, and false otherwise.
+ */
+bool
+is_valid_nondefault_collprovider(char collprovider)
+{
+ return (collprovider == COLLPROVIDER_LIBC ||
+ collprovider == COLLPROVIDER_ICU);
+}
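
Reviewer's sketch (not part of the patch): the lookup table above can be tried out standalone. The version below is only an illustration under stated assumptions: every sketch_* name is hypothetical, the provider codes 'd', 'c' and 'i' are taken from the literal CASE expression that the psql hunk replaces, and a small ASCII-only comparison stands in for pg_strcasecmp().

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the patch's collprovider_name table. */
typedef struct
{
	char		code;			/* 'd', 'c' or 'i' */
	const char *name;
} sketch_provider;

static const sketch_provider sketch_tbl[] = {
	{'d', "default"},
	{'c', "libc"},
	{'i', "icu"},
	{'\0', NULL}				/* end marker */
};

/* ASCII-only case-insensitive comparison (assumed stand-in for pg_strcasecmp). */
int
sketch_casecmp(const char *a, const char *b)
{
	while (*a && *b)
	{
		int			ca = tolower((unsigned char) *a);
		int			cb = tolower((unsigned char) *b);

		if (ca != cb)
			return ca - cb;
		a++;
		b++;
	}
	return (unsigned char) *a - (unsigned char) *b;
}

/* Name -> provider code; '\0' when the name is NULL or unknown. */
char
sketch_get_collprovider(const char *name)
{
	int			i;

	if (!name)
		return '\0';
	for (i = 0; sketch_tbl[i].name; i++)
		if (sketch_casecmp(name, sketch_tbl[i].name) == 0)
			return sketch_tbl[i].code;
	return '\0';
}

/* Provider code -> name; NULL when the code is unknown. */
const char *
sketch_get_collprovider_name(char code)
{
	int			i;

	for (i = 0; sketch_tbl[i].code; i++)
		if (sketch_tbl[i].code == code)
			return sketch_tbl[i].name;
	return NULL;
}

/* Every table entry must survive a name -> code -> name round trip. */
void
sketch_check_roundtrip(void)
{
	int			i;

	for (i = 0; sketch_tbl[i].name; i++)
	{
		char		code = sketch_get_collprovider(sketch_tbl[i].name);

		assert(code == sketch_tbl[i].code);
		assert(strcmp(sketch_get_collprovider_name(code),
					  sketch_tbl[i].name) == 0);
	}
}
```

For instance, sketch_get_collprovider("ICU") yields 'i' and sketch_get_collprovider_name('x') yields NULL, which mirrors the "return '\0' / NULL if we can't determine it" contract documented in the patch.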
diff --git a/src/fe_utils/.gitignore b/src/fe_utils/.gitignore
index 37f5f75..b14041b 100644
--- a/src/fe_utils/.gitignore
+++ b/src/fe_utils/.gitignore
@@ -1 +1,2 @@
/psqlscan.c
+/pg_collation_fn_common.c
diff --git a/src/fe_utils/Makefile b/src/fe_utils/Makefile
index 3f4ba8b..4bdfd17 100644
--- a/src/fe_utils/Makefile
+++ b/src/fe_utils/Makefile
@@ -19,7 +19,8 @@ include $(top_builddir)/src/Makefile.global
override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) $(CPPFLAGS)
-OBJS = mbprint.o print.o psqlscan.o simple_list.o string_utils.o
+OBJS = mbprint.o print.o psqlscan.o simple_list.o string_utils.o \
+ pg_collation_fn_common.o
all: libpgfeutils.a
@@ -33,6 +34,13 @@ psqlscan.c: FLEX_FIX_WARNING=yes
distprep: psqlscan.c
+# Pull in pg_collation_fn_common.c from src/common. That exposes us to
+# risks of version skew if we link to a shared library. Do it the
+# hard way, instead, so that we're statically linked.
+
+pg_collation_fn_common.c: % : $(top_srcdir)/src/common/%
+ rm -f $@ && $(LN_S) $< .
+
# libpgfeutils could be useful to contrib, so install it
install: all installdirs
$(INSTALL_STLIB) libpgfeutils.a '$(DESTDIR)$(libdir)/libpgfeutils.a'
@@ -45,6 +53,7 @@ uninstall:
clean distclean:
rm -f libpgfeutils.a $(OBJS) lex.backup
+ rm -f pg_collation_fn_common.c
# psqlscan.c is supposed to be in the distribution tarball,
# so do not clean it in the clean/distclean rules
diff --git a/src/include/commands/dbcommands.h b/src/include/commands/dbcommands.h
index 677c7fc..d1b2776 100644
--- a/src/include/commands/dbcommands.h
+++ b/src/include/commands/dbcommands.h
@@ -29,6 +29,7 @@ extern ObjectAddress AlterDatabaseOwner(const char *dbname, Oid newOwnerId);
extern Oid get_database_oid(const char *dbname, bool missingok);
extern char *get_database_name(Oid dbid);
-extern void check_encoding_locale_matches(int encoding, const char *collate, const char *ctype);
+extern void check_encoding_locale_matches(int encoding, const char *collate, const char *ctype,
+ char collprovider);
#endif /* DBCOMMANDS_H */
diff --git a/src/include/common/pg_collation_fn_common.h b/src/include/common/pg_collation_fn_common.h
new file mode 100644
index 0000000..f05778d
--- /dev/null
+++ b/src/include/common/pg_collation_fn_common.h
@@ -0,0 +1,22 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_collation_fn_common.h
+ * prototypes for functions in common/pg_collation_fn_common.c
+ *
+ *
+ * Portions Copyright (c) 1996-2017, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/common/pg_collation_fn_common.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#ifndef PG_COLLATION_FN_COMMON_H
+#define PG_COLLATION_FN_COMMON_H
+
+extern char get_collprovider(const char *name);
+extern const char *get_collprovider_name(char collprovider);
+extern bool is_valid_nondefault_collprovider(char collprovider);
+
+#endif /* PG_COLLATION_FN_COMMON_H */
diff --git a/src/include/pg_config.h.win32 b/src/include/pg_config.h.win32
index 22d19ed..de91567 100644
--- a/src/include/pg_config.h.win32
+++ b/src/include/pg_config.h.win32
@@ -623,6 +623,10 @@
/* Define to use /dev/urandom for random number generation */
/* #undef USE_DEV_URANDOM */
+/* Define to build with ICU support. (--with-icu) */
+/* #undef USE_ICU */
+
+
/* Define to 1 to build with LDAP support. (--with-ldap) */
/* #undef USE_LDAP */
diff --git a/src/include/port.h b/src/include/port.h
index 3e528fa..3ed7d11 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -419,6 +419,40 @@ extern int pg_get_encoding_from_locale(const char *ctype, bool write_message);
extern int pg_codepage_to_encoding(UINT cp);
#endif
+/* do not build libpq with ICU */
+#ifndef LIBPQ_MAKE
+
+extern void check_locale_collprovider(const char *locale, char **canonname,
+ char *collprovider, char **collversion);
+extern bool locale_is_c(const char *locale);
+extern char *get_full_collation_name(const char *locale, char collprovider,
+ const char *collversion);
+
+#ifdef FRONTEND
+extern void get_collation_actual_version(char collprovider,
+ const char *collcollate,
+ char **collversion, bool *failure);
+#else
+extern char *get_collation_actual_version(char collprovider,
+ const char *collcollate);
+#endif
+
+#ifdef USE_ICU
+#define ICU_ROOT_LOCALE "root"
+
+/* Users of this must include unicode/ucol.h too. */
+struct UCollator;
+extern struct UCollator *open_collator(const char *collate);
+
+extern char *get_icu_language_tag(const char *localename);
+extern const char *get_icu_collate(const char *locale, const char *langtag);
+#ifdef WIN32
+extern char *check_icu_winlocale(const char *winlocale);
+#endif /* WIN32 */
+#endif /* USE_ICU */
+
+#endif /* not LIBPQ_MAKE */
+
/* port/inet_net_ntop.c */
extern char *inet_net_ntop(int af, const void *src, int bits,
char *dst, size_t size);
diff --git a/src/include/port/win32.h b/src/include/port/win32.h
index 9f48a58..7e3e7e5 100644
--- a/src/include/port/win32.h
+++ b/src/include/port/win32.h
@@ -16,7 +16,7 @@
* get support for GetLocaleInfoEx() with locales. For everything else
* the minimum version is Windows XP (0x0501).
*/
-#if defined(_MSC_VER) && _MSC_VER >= 1900
+#if defined(_MSC_VER) && _MSC_VER >= 1800
#define MIN_WINNT 0x0600
#else
#define MIN_WINNT 0x0501
diff --git a/src/include/utils/pg_locale.h b/src/include/utils/pg_locale.h
index 88a3134..161a14e 100644
--- a/src/include/utils/pg_locale.h
+++ b/src/include/utils/pg_locale.h
@@ -57,8 +57,10 @@ extern void assign_locale_numeric(const char *newval, void *extra);
extern bool check_locale_time(char **newval, void **extra, GucSource source);
extern void assign_locale_time(const char *newval, void *extra);
-extern bool check_locale(int category, const char *locale, char **canonname);
-extern char *pg_perm_setlocale(int category, const char *locale);
+extern bool check_locale(int category, const char *locale, char **canonname,
+ char collprovider);
+extern const char *pg_perm_setlocale(int category, const char *locale,
+ char collprovider);
extern void check_strxfrm_bug(void);
extern bool lc_collate_is_c(Oid collation);
@@ -102,11 +104,11 @@ typedef struct pg_locale_struct *pg_locale_t;
extern pg_locale_t pg_newlocale_from_collation(Oid collid);
-extern char *get_collation_actual_version(char collprovider, const char *collcollate);
-
#ifdef USE_ICU
extern int32_t icu_to_uchar(UChar **buff_uchar, const char *buff, size_t nbytes);
extern int32_t icu_from_uchar(char **result, const UChar *buff_uchar, int32_t len_uchar);
+extern const char *get_icu_default_collate(void);
+extern UCollator *get_default_collation_collator(void);
#endif
/* These functions convert from/to libc's wchar_t, *not* pg_wchar_t */
@@ -115,4 +117,6 @@ extern size_t wchar2char(char *to, const wchar_t *from, size_t tolen,
extern size_t char2wchar(wchar_t *to, size_t tolen,
const char *from, size_t fromlen, pg_locale_t locale);
+extern char get_default_collprovider(void);
+
#endif /* _PG_LOCALE_ */
diff --git a/src/interfaces/libpq/.gitignore b/src/interfaces/libpq/.gitignore
index 5c232ae..212edd9 100644
--- a/src/interfaces/libpq/.gitignore
+++ b/src/interfaces/libpq/.gitignore
@@ -32,3 +32,4 @@
/unicode_norm.c
/encnames.c
/wchar.c
+/pg_collation_fn_common.c
diff --git a/src/interfaces/libpq/Makefile b/src/interfaces/libpq/Makefile
index abe0a50..32a5d43 100644
--- a/src/interfaces/libpq/Makefile
+++ b/src/interfaces/libpq/Makefile
@@ -19,7 +19,7 @@ NAME= pq
SO_MAJOR_VERSION= 5
SO_MINOR_VERSION= $(MAJORVERSION)
-override CPPFLAGS := -DFRONTEND -DUNSAFE_STAT_OK -I$(srcdir) $(CPPFLAGS) -I$(top_builddir)/src/port -I$(top_srcdir)/src/port
+override CPPFLAGS := -DFRONTEND -DUNSAFE_STAT_OK -I$(srcdir) $(CPPFLAGS) -I$(top_builddir)/src/port -I$(top_srcdir)/src/port -DLIBPQ_MAKE
ifneq ($(PORTNAME), win32)
override CFLAGS += $(PTHREAD_CFLAGS)
endif
diff --git a/src/port/chklocale.c b/src/port/chklocale.c
index dde9130..a30bded 100644
--- a/src/port/chklocale.c
+++ b/src/port/chklocale.c
@@ -23,8 +23,26 @@
#include <langinfo.h>
#endif
+#ifdef USE_ICU
+#include <unicode/ucol.h>
+#endif
+
+#include "catalog/pg_collation.h"
+#include "common/pg_collation_fn_common.h"
#include "mb/pg_wchar.h"
+/*
+ * In backend, we will use palloc/pfree. In frontend, use malloc/free.
+ */
+#ifndef FRONTEND
+#define STRDUP(s) pstrdup(s)
+#define ALLOC(size) palloc(size)
+#define FREE(s) pfree(s)
+#else
+#define STRDUP(s) strdup(s)
+#define ALLOC(size) malloc(size)
+#define FREE(s) free(s)
+#endif
/*
* This table needs to recognize all the CODESET spellings for supported
@@ -436,3 +454,583 @@ pg_get_encoding_from_locale(const char *ctype, bool write_message)
}
#endif /* (HAVE_LANGINFO_H && CODESET) || WIN32 */
+
+/* do not build libpq with ICU */
+#ifndef LIBPQ_MAKE
+
+/*
+ * Check whether the locale name ends with a collation provider modifier.
+ *
+ * Set *collprovider to the provider named by the modifier, or to '\0' if there
+ * is no such modifier. Set *collversion to the version that follows the
+ * provider modifier, or to NULL if there is none.
+ *
+ * If locale is not NULL, store in *canonname a copy (palloc'd in the backend,
+ * malloc'd in the frontend) of the locale name with the provider modifier and
+ * the version removed. The canonname can have zero length.
+ */
+void
+check_locale_collprovider(const char *locale, char **canonname,
+ char *collprovider, char **collversion)
+{
+ const char *modifier_sign,
+ *dot_sign,
+ *cur_collprovider_end;
+ char cur_collprovider_name[NAMEDATALEN];
+ int cur_collprovider_len;
+ char cur_collprovider;
+
+ /* in case of failure or if we don't find them in the locale name */
+ if (canonname)
+ *canonname = NULL;
+ if (collprovider)
+ *collprovider = '\0';
+ if (collversion)
+ *collversion = NULL;
+
+ if (!locale)
+ return;
+
+ /* find the last occurrence of the modifier sign '@' in the locale */
+ modifier_sign = strrchr(locale, '@');
+
+ if (!modifier_sign)
+ {
+ /* just copy all the name */
+ if (canonname)
+ *canonname = STRDUP(locale);
+ return;
+ }
+
+ /* check if there's a version after the collation provider modifier */
+ if ((dot_sign = strchr(modifier_sign, '.')) == NULL)
+ cur_collprovider_end = &locale[strlen(locale)];
+ else
+ cur_collprovider_end = dot_sign;
+
+ cur_collprovider_len = cur_collprovider_end - modifier_sign - 1;
+ if (cur_collprovider_len + 1 > NAMEDATALEN)
+ {
+#ifdef FRONTEND
+ fprintf(stderr, _("collation provider name is too long: %s"), locale);
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else /* not FRONTEND */
+ ereport(ERROR,
+ (errmsg("collation provider name is too long: %s", locale)));
+#endif /* not FRONTEND */
+ return;
+ }
+
+ strncpy(cur_collprovider_name, modifier_sign + 1, cur_collprovider_len);
+ cur_collprovider_name[cur_collprovider_len] = '\0';
+
+ /* check if this is a valid collprovider name */
+ cur_collprovider = get_collprovider(cur_collprovider_name);
+ if (is_valid_nondefault_collprovider(cur_collprovider))
+ {
+ if (collprovider)
+ *collprovider = cur_collprovider;
+
+ if (canonname)
+ {
+ int canonname_len = modifier_sign - locale;
+
+ *canonname = ALLOC((canonname_len + 1) * sizeof(char));
+ if (*canonname)
+ {
+ strncpy(*canonname, locale, canonname_len);
+ (*canonname)[canonname_len] = '\0';
+ }
+ else
+ {
+#ifdef FRONTEND
+ fprintf(stderr, _("out of memory"));
+ /*
+ * keep newline separate so there's only one translatable string
+ */
+ fputc('\n', stderr);
+#else /* not FRONTEND */
+ ereport(ERROR, (errmsg("out of memory")));
+#endif /* not FRONTEND */
+ }
+ }
+
+ if (dot_sign && collversion)
+ *collversion = STRDUP(dot_sign + 1);
+ }
+ else
+ {
+ /* just copy all the name */
+ if (canonname)
+ *canonname = STRDUP(locale);
+ }
+}
+
+/*
+ * Return true if locale is "C" or "POSIX".
+ */
+bool
+locale_is_c(const char *locale)
+{
+ return locale && (strcmp(locale, "C") == 0 || strcmp(locale, "POSIX") == 0);
+}
+
+/*
+ * Return the locale with the collation provider modifier and version appended.
+ *
+ * Return NULL if locale is NULL.
+ */
+char *
+get_full_collation_name(const char *locale, char collprovider,
+ const char *collversion)
+{
+ char *new_locale;
+ int old_len,
+ len_with_provider,
+ new_len;
+ const char *collprovider_name;
+
+ if (!locale)
+ return NULL;
+
+ collprovider_name = get_collprovider_name(collprovider);
+ Assert(collprovider_name);
+
+ old_len = strlen(locale);
+ new_len = len_with_provider = old_len + 1 + strlen(collprovider_name);
+ if (collversion && *collversion)
+ new_len += 1 + strlen(collversion);
+
+ new_locale = ALLOC((new_len + 1) * sizeof(char));
+ if (!new_locale)
+ {
+#ifdef FRONTEND
+ fprintf(stderr, _("out of memory"));
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else /* not FRONTEND */
+ ereport(ERROR, (errmsg("out of memory")));
+#endif /* not FRONTEND */
+
+ return NULL;
+ }
+
+ /* add the collation provider modifier */
+ strcpy(new_locale, locale);
+ new_locale[old_len] = '@';
+ strcpy(&new_locale[old_len + 1], collprovider_name);
+
+ /* add the collation version if needed */
+ if (collversion && *collversion)
+ {
+ new_locale[len_with_provider] = '.';
+ strcpy(&new_locale[len_with_provider + 1], collversion);
+ }
+
+ new_locale[new_len] = '\0';
+
+ return new_locale;
+}
+
+/*
+ * Get provider-specific collation version string for the given collation from
+ * the operating system/library.
+ *
+ * A particular provider must always either return a non-NULL string or return
+ * NULL (if it doesn't support versions). It must not return NULL for some
+ * collcollate and not NULL for others.
+ */
+#ifdef FRONTEND
+void
+get_collation_actual_version(char collprovider, const char *collcollate,
+ char **collversion, bool *failure)
+{
+ if (failure)
+ *failure = false;
+
+#ifdef USE_ICU
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+ UCollator *collator = open_collator(collcollate);
+ UVersionInfo versioninfo;
+ char buf[U_MAX_VERSION_STRING_LENGTH];
+
+ if (collator)
+ {
+ ucol_getVersion(collator, versioninfo);
+ ucol_close(collator);
+
+ u_versionToString(versioninfo, buf);
+ if (collversion)
+ *collversion = STRDUP(buf);
+ }
+ else
+ {
+ if (collversion)
+ *collversion = NULL;
+ if (failure)
+ *failure = true;
+ }
+ }
+ else
+#endif
+ {
+ if (collversion)
+ *collversion = NULL;
+ }
+}
+#else /* not FRONTEND */
+char *
+get_collation_actual_version(char collprovider, const char *collcollate)
+{
+ char *collversion;
+
+#ifdef USE_ICU
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+ UCollator *collator = open_collator(collcollate);
+ UVersionInfo versioninfo;
+ char buf[U_MAX_VERSION_STRING_LENGTH];
+
+ ucol_getVersion(collator, versioninfo);
+ ucol_close(collator);
+
+ u_versionToString(versioninfo, buf);
+ collversion = STRDUP(buf);
+ }
+ else
+#endif
+ collversion = NULL;
+
+ return collversion;
+}
+#endif /* not FRONTEND */
+
+#ifdef USE_ICU
+/*
+ * Open the collator for this icu locale. Return NULL in case of failure.
+ */
+UCollator *
+open_collator(const char *collate)
+{
+ UCollator *collator;
+ UErrorCode status;
+ const char *save = uloc_getDefault();
+ char *save_dup;
+
+ if (!save)
+ {
+#ifdef FRONTEND
+ fprintf(stderr, _("ICU error: uloc_getDefault() failed"));
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR, (errmsg("ICU error: uloc_getDefault() failed")));
+#endif
+ return NULL;
+ }
+
+ /* save may be pointing at a modifiable scratch variable, so copy it. */
+ save_dup = STRDUP(save);
+
+ /* set the default locale to root */
+ status = U_ZERO_ERROR;
+ uloc_setDefault(ICU_ROOT_LOCALE, &status);
+ if (U_FAILURE(status))
+ {
+#ifdef FRONTEND
+ fprintf(stderr, _("ICU error: failed to set the default locale to \"%s\": %s"),
+ ICU_ROOT_LOCALE, u_errorName(status));
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR,
+ (errmsg("ICU error: failed to set the default locale to \"%s\": %s",
+ ICU_ROOT_LOCALE, u_errorName(status))));
+#endif
+ FREE(save_dup); /* reached only in frontend; don't leak the copy */
+ return NULL;
+ }
+
+ /* get a collator for this collate */
+ status = U_ZERO_ERROR;
+ collator = ucol_open(collate, &status);
+ if (U_FAILURE(status))
+ {
+#ifdef FRONTEND
+ fprintf(stderr, _("ICU error: could not open collator for locale \"%s\": %s"),
+ collate, u_errorName(status));
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR,
+ (errmsg("ICU error: could not open collator for locale \"%s\": %s",
+ collate, u_errorName(status))));
+#endif
+ collator = NULL;
+ }
+
+ /* restore old value of the default locale. */
+ status = U_ZERO_ERROR;
+ uloc_setDefault(save_dup, &status);
+ if (U_FAILURE(status))
+ {
+#ifdef FRONTEND
+ fprintf(stderr, _("ICU error: failed to restore old locale \"%s\": %s"),
+ save_dup, u_errorName(status));
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR,
+ (errmsg("ICU error: failed to restore old locale \"%s\": %s",
+ save_dup, u_errorName(status))));
+#endif
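+ /* reached only in frontend; release what we hold before giving up */
+ if (collator)
+ ucol_close(collator);
+ FREE(save_dup);
+ return NULL;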
+ }
+ FREE(save_dup);
+
+ return collator;
+}
+
+/*
+ * Get the ICU language tag for a locale name.
+ * The result is a palloc'd (backend) or malloc'd (frontend) string.
+ * Return NULL in case of failure or if localename is NULL.
+ */
+char *
+get_icu_language_tag(const char *localename)
+{
+ char buf[ULOC_FULLNAME_CAPACITY];
+ UErrorCode status = U_ZERO_ERROR;
+
+ if (!localename)
+ return NULL;
+
+ uloc_toLanguageTag(localename, buf, sizeof(buf), TRUE, &status);
+ if (U_FAILURE(status))
+ {
+#ifdef FRONTEND
+ fprintf(stderr,
+ _("ICU error: could not convert locale name \"%s\" to language tag: %s"),
+ localename, u_errorName(status));
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR,
+ (errmsg("ICU error: could not convert locale name \"%s\" to language tag: %s",
+ localename, u_errorName(status))));
+#endif
+ return NULL;
+ }
+ return STRDUP(buf);
+}
+
+/*
+ * Get the icu collation name.
+ */
+const char *
+get_icu_collate(const char *locale, const char *langtag)
+{
+ return U_ICU_VERSION_MAJOR_NUM >= 54 ? langtag : locale;
+}
+
+#ifdef WIN32
+/*
+ * Get the Language Code Identifier (LCID) for the Windows locale.
+ *
+ * Return zero in case of failure.
+ */
+static uint32
+get_lcid(const wchar_t *winlocale)
+{
+ /*
+ * The second argument to the LocaleNameToLCID function is:
+ * - Prior to Windows 7: reserved; should always be 0.
+ * - Beginning in Windows 7: use LOCALE_ALLOW_NEUTRAL_NAMES to allow the
+ * return of lcids of locales without regions.
+ */
+#if (NTDDI_VERSION >= NTDDI_WIN7)
+ return LocaleNameToLCID(winlocale, LOCALE_ALLOW_NEUTRAL_NAMES);
+#else
+ return LocaleNameToLCID(winlocale, 0);
+#endif
+}
+
+/*
+ * char2wchar_ascii --- convert multibyte characters to wide characters
+ *
+ * This is a simplified version of the char2wchar() function from backend.
+ */
+static size_t
+char2wchar_ascii(wchar_t *to, size_t tolen, const char *from, size_t fromlen)
+{
+ size_t result;
+
+ if (tolen == 0)
+ return 0;
+
+ /* Win32 API does not work for zero-length input */
+ if (fromlen == 0)
+ result = 0;
+ else
+ {
+ result = MultiByteToWideChar(CP_ACP, 0, from, fromlen, to, tolen - 1);
+ /* A zero return is failure */
+ if (result == 0)
+ result = -1;
+ }
+
+ if (result != -1)
+ {
+ Assert(result < tolen);
+ /* Append trailing null wchar (MultiByteToWideChar() does not) */
+ to[result] = 0;
+ }
+
+ return result;
+}
+
+/*
+ * Get the canonical ICU name for the Windows locale.
+ *
+ * Return a palloc'd (backend) or malloc'd (frontend) string, or NULL on failure.
+ */
+char *
+check_icu_winlocale(const char *winlocale)
+{
+ uint32 lcid;
+ char canonname_buf[ULOC_FULLNAME_CAPACITY];
+ UErrorCode status = U_ZERO_ERROR;
+#if (_MSC_VER >= 1400) /* VC8.0 or later */
+ _locale_t loct = NULL;
+#endif
+
+ if (winlocale == NULL)
+ return NULL;
+
+ /* Get the Language Code Identifier (LCID). */
+
+#if (_MSC_VER >= 1400) /* VC8.0 or later */
+ loct = _create_locale(LC_COLLATE, winlocale);
+
+ if (loct != NULL)
+ {
+#if (_MSC_VER >= 1700) /* Visual Studio 2012 or later */
+ if ((lcid = get_lcid(loct->locinfo->locale_name[LC_COLLATE])) == 0)
+ {
+ /* there's an error */
+#ifdef FRONTEND
+ fprintf(stderr,
+ _("failed to get the Language Code Identifier (LCID) for locale \"%s\""),
+ winlocale);
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else /* not FRONTEND */
+ ereport(ERROR,
+ (errmsg("failed to get the Language Code Identifier (LCID) for locale \"%s\"",
+ winlocale)));
+#endif /* not FRONTEND */
+ _free_locale(loct);
+ return NULL;
+ }
+#else /* _MSC_VER >= 1400 && _MSC_VER < 1700 */
+ if ((lcid = loct->locinfo->lc_handle[LC_COLLATE]) == 0)
+ {
+ /* there's an error */
+#ifdef FRONTEND
+ fprintf(stderr,
+ _("failed to get the Language Code Identifier (LCID) for locale \"%s\""),
+ winlocale);
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else /* not FRONTEND */
+ ereport(ERROR,
+ (errmsg("failed to get the Language Code Identifier (LCID) for locale \"%s\"",
+ winlocale)));
+#endif /* not FRONTEND */
+ _free_locale(loct);
+ return NULL;
+ }
+#endif /* _MSC_VER >= 1400 && _MSC_VER < 1700 */
+ _free_locale(loct);
+ }
+ else
+#endif /* VC8.0 or later */
+ {
+ if (strlen(winlocale) == 0)
+ {
+ lcid = LOCALE_USER_DEFAULT;
+ }
+ else
+ {
+ size_t locale_len = strlen(winlocale);
+ wchar_t *wlocale = (wchar_t*) ALLOC(
+ (locale_len + 1) * sizeof(wchar_t));
+ /* Locale names use only ASCII */
+ size_t locale_wlen = char2wchar_ascii(wlocale, locale_len + 1,
+ winlocale, locale_len);
+ if (locale_wlen == -1)
+ {
+ /* there's an error */
+#ifdef FRONTEND
+ fprintf(stderr,
+ _("failed to convert locale \"%s\" to wide characters"),
+ winlocale);
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR,
+ (errmsg("failed to convert locale \"%s\" to wide characters",
+ winlocale)));
+#endif
+ FREE(wlocale);
+ return NULL;
+ }
+
+ if ((lcid = get_lcid(wlocale)) == 0)
+ {
+ /* there's an error */
+#ifdef FRONTEND
+ fprintf(stderr,
+ _("failed to get the Language Code Identifier (LCID) for locale \"%s\""),
+ winlocale);
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR,
+ (errmsg("failed to get the Language Code Identifier (LCID) for locale \"%s\"",
+ winlocale)));
+#endif
+ FREE(wlocale);
+ return NULL;
+ }
+
+ FREE(wlocale);
+ }
+ }
+
+ /* Get the canonical ICU name. */
+
+ uloc_getLocaleForLCID(lcid, canonname_buf, sizeof(canonname_buf), &status);
+ if (U_FAILURE(status))
+ {
+#ifdef FRONTEND
+ fprintf(stderr,
+ _("ICU error: failed to get the locale name for LCID 0x%04x: %s"),
+ lcid, u_errorName(status));
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR,
+ (errmsg("ICU error: failed to get the locale name for LCID 0x%04x: %s",
+ lcid, u_errorName(status))));
+#endif
+ return NULL;
+ }
+
+ return STRDUP(canonname_buf);
+}
+#endif /* WIN32 */
+#endif /* USE_ICU */
+
+#endif /* not LIBPQ_MAKE */
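
Reviewer's sketch (not part of the patch): the "locale@provider.version" convention that check_locale_collprovider() parses can be illustrated in isolation. This is a simplified model under stated assumptions: SplitLocale, split_locale() and free_split_locale() are hypothetical names, plain malloc/free replace the ALLOC/STRDUP macros, the provider name is matched case-sensitively (the patch uses a case-insensitive lookup), and the NAMEDATALEN length check is omitted.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type: the pieces of "name@provider.version". */
typedef struct SplitLocale
{
	char	   *canonname;		/* malloc'd; locale without provider/version */
	char		provider;		/* 'c' (libc), 'i' (icu), or '\0' if none */
	char	   *version;		/* malloc'd, or NULL if no version was given */
} SplitLocale;

/* Copy len bytes of s into a fresh NUL-terminated string. */
static char *
dup_range(const char *s, size_t len)
{
	char	   *p = malloc(len + 1);

	if (p)
	{
		memcpy(p, s, len);
		p[len] = '\0';
	}
	return p;
}

/* Map a non-default provider name to its code; '\0' if unrecognized. */
static char
provider_from_name(const char *s, size_t len)
{
	if (len == 4 && strncmp(s, "libc", 4) == 0)
		return 'c';
	if (len == 3 && strncmp(s, "icu", 3) == 0)
		return 'i';
	return '\0';
}

void
free_split_locale(SplitLocale *out)
{
	free(out->canonname);
	free(out->version);
	out->canonname = NULL;
	out->version = NULL;
}

/* Return 0 on success, -1 on out-of-memory. */
int
split_locale(const char *locale, SplitLocale *out)
{
	const char *at;
	const char *dot = NULL;

	assert(locale != NULL && out != NULL);
	out->canonname = NULL;
	out->provider = '\0';
	out->version = NULL;

	/* Only the last '@' can introduce the provider modifier. */
	at = strrchr(locale, '@');
	if (at)
	{
		const char *end;

		/* A '.' after the modifier separates the provider from the version. */
		dot = strchr(at, '.');
		end = dot ? dot : locale + strlen(locale);
		out->provider = provider_from_name(at + 1, (size_t) (end - at - 1));
	}

	if (out->provider == '\0')
	{
		/* No provider modifier: the whole string is the locale name. */
		out->canonname = dup_range(locale, strlen(locale));
		return out->canonname ? 0 : -1;
	}

	out->canonname = dup_range(locale, (size_t) (at - locale));
	if (dot)
		out->version = dup_range(dot + 1, strlen(dot + 1));

	if (!out->canonname || (dot && !out->version))
	{
		free_split_locale(out);
		return -1;
	}
	return 0;
}
```

Under this model "be_BY@latin@icu" splits into "be_BY@latin" plus provider 'i', while "be_BY@icu@latin" is left whole because its last modifier is not a provider name. That is presumably why the TAP tests expect the latter to be rejected later as an invalid locale name.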
diff --git a/src/test/Makefile b/src/test/Makefile
index 73abf16..259bb1f 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,7 @@ subdir = src/test
top_builddir = ../..
include $(top_builddir)/src/Makefile.global
-SUBDIRS = perl regress isolation modules authentication recovery subscription
+SUBDIRS = perl regress isolation modules authentication recovery subscription default_collation
# We don't build or execute examples/, locale/, or thread/ by default,
# but we do want "make clean" etc to recurse into them. Likewise for
diff --git a/src/test/default_collation/Makefile b/src/test/default_collation/Makefile
new file mode 100644
index 0000000..2efe8be
--- /dev/null
+++ b/src/test/default_collation/Makefile
@@ -0,0 +1,28 @@
+# src/test/default_collation/Makefile
+
+subdir = src/test/default_collation
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+ifeq ($(with_icu),yes)
+check:
+ $(MAKE) -C icu check
+check-utf8:
+ $(MAKE) -C icu.utf8 check
+ $(MAKE) -C libc.utf8 check
+else
+check:
+ $(MAKE) -C libc check
+check-utf8:
+ $(MAKE) -C libc.utf8 check
+endif
+
+# We don't check libc/ if with_icu or vice versa, but we do want "make clean" to
+# recurse into it. The same goes for libc.utf8/ or icu.utf8/, which we don't
+# check by default.
+ALWAYS_SUBDIRS = libc libc.utf8 icu icu.utf8
+
+clean distclean maintainer-clean:
+ for d in $(ALWAYS_SUBDIRS); do \
+ $(MAKE) -C $$d clean || exit; \
+ done
diff --git a/src/test/default_collation/icu.utf8/.gitignore b/src/test/default_collation/icu.utf8/.gitignore
new file mode 100644
index 0000000..871e943
--- /dev/null
+++ b/src/test/default_collation/icu.utf8/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/default_collation/icu.utf8/Makefile b/src/test/default_collation/icu.utf8/Makefile
new file mode 100644
index 0000000..7adecfd
--- /dev/null
+++ b/src/test/default_collation/icu.utf8/Makefile
@@ -0,0 +1,11 @@
+# src/test/default_collation/icu.utf8/Makefile
+
+subdir = src/test/default_collation/icu.utf8
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+ $(prove_check)
+
+clean distclean maintainer-clean:
+ rm -rf tmp_check
diff --git a/src/test/default_collation/icu.utf8/t/001_default_collation.pl b/src/test/default_collation/icu.utf8/t/001_default_collation.pl
new file mode 100644
index 0000000..617c06d
--- /dev/null
+++ b/src/test/default_collation/icu.utf8/t/001_default_collation.pl
@@ -0,0 +1,799 @@
+use strict;
+use warnings;
+
+use Config;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 188;
+
+my $tempdir = TestLib::tempdir;
+my $datadir = "$tempdir/data";
+
+# test initdb
+
+sub test_initdb
+{
+ my ($test_name, $options, $expected_collprovider, $error_message) = @_;
+ my ($in_initdb, $out_initdb, $err_initdb);
+
+ mkdir $datadir;
+
+ my @command = (qw(initdb -A trust -N -D), $datadir, split(" ", $options));
+ print "# Running: " . join(" ", @command) . "\n";
+ my $result = IPC::Run::run \@command, \$in_initdb, \$out_initdb,
+ \$err_initdb;
+
+ if ($error_message)
+ {
+ like($err_initdb,
+ qr{$error_message},
+ "initdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_initdb, "", "\"@command\" no stderr");
+ like($out_initdb,
+ qr{The default collation provider is \"$expected_collprovider\"\.},
+ "initdb: $test_name: check output");
+ }
+
+ File::Path::rmtree $datadir;
+}
+
+sub psql
+{
+ my ($command, $db) = @_;
+ my ($result, $in, $out, $err);
+ my @psql = ('psql', '-X', '-c', $command);
+ if (defined($db))
+ {
+ push(@psql, $db);
+ }
+ print "# Running: " . join(" ", @psql) . "\n";
+ $result = IPC::Run::run \@psql, \$in, \$out, \$err;
+ ($result, $out, $err);
+}
+
+# --locale
+
+test_initdb(
+ "en_US.utf8 locale",
+ "--locale=en_US.utf8",
+ "icu",
+ "");
+
+test_initdb(
+ "en_US.utf8 locale with C ctype",
+ "--locale=en_US.utf8 --lc-ctype=C",
+ "",
+ "collations with different collate and ctype values are not supported by "
+ . "ICU");
+
+test_initdb(
+ "be_BY\@latin icu locale",
+ "--locale=be_BY\@latin\@icu",
+ "icu",
+ "");
+
+test_initdb(
+ "be_BY\@latin icu locale invalid modifier order",
+ "--locale=be_BY\@icu\@latin",
+ "",
+ "invalid locale name \"be_BY\@icu\@latin\"");
+
+# --lc-collate with the same --lc-ctype if needed
+
+test_initdb(
+ "en_US.utf8 lc_collate",
+ "--lc-collate=en_US.utf8 --lc-ctype=en_US.utf8",
+ "icu",
+ "");
+
+test_initdb(
+ "en_US.utf8 lc_collate with C ctype",
+ "--lc-collate=en_US.utf8 --lc-ctype=C",
+ "",
+ "collations with different collate and ctype values are not supported by "
+ . "ICU");
+
+test_initdb(
+ "be_BY\@latin icu lc_collate",
+ "--lc-collate=be_BY\@latin\@icu --lc-ctype=be_BY\@latin",
+ "icu",
+ "");
+
+test_initdb(
+ "be_BY\@latin icu lc_collate invalid modifier order",
+ "--lc-collate=be_BY\@icu\@latin",
+ "",
+ "invalid locale name \"be_BY\@icu\@latin\" \\(provider \"libc\"\\)");
+
+# test createdb and CREATE DATABASE
+
+sub test_createdb
+{
+ my ($test_name, $options, $expected_collprovider, $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ @command = ("createdb",
+ split(" ", $options),
+ "--template=template0",
+ "mydb");
+
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "createdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+
+ ($result, $out_command, $err_command) = psql(
+ "select datcollate from pg_database where datname = 'mydb';");
+
+ if ($expected_collprovider eq "libc")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider\n},
+ "createdb: $test_name: check pg_database.datcollate");
+ }
+ elsif ($expected_collprovider eq "icu")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider([\.\d]+)?\n},
+ "createdb: $test_name: check pg_database.datcollate");
+ }
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+sub test_create_database
+{
+ my ($test_name,
+ $options,
+ $expected_collprovider,
+ $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ ($result, $out_command, $err_command) = psql(
+ "create database mydb "
+ . $options
+ . " template = template0;");
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "CREATE DATABASE: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "CREATE DATABASE: $test_name: exit code 0");
+ is($err_command, "", "CREATE DATABASE: $test_name: no stderr");
+ like($out_command, qr{CREATE DATABASE}, "CREATE DATABASE: $test_name: check output");
+
+ ($result, $out_command, $err_command) = psql(
+ "select datcollate from pg_database where datname = 'mydb';");
+
+ if ($expected_collprovider eq "libc")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider\n},
+ "CREATE DATABASE: $test_name: check pg_database.datcollate");
+ }
+ elsif ($expected_collprovider eq "icu")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider([\.\d]+)?\n},
+ "CREATE DATABASE: $test_name: check pg_database.datcollate");
+ }
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+sub test_default_collation
+{
+ my ($createdb_options, $collation, $expected_collprovider, @commands) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ @command = ("createdb", split(" ", $createdb_options), "mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+
+ ($result, $out_command, $err_command) = psql(
+ "select datcollate from pg_database where datname = 'mydb';");
+
+ if ($expected_collprovider eq "libc")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider\n},
+ "\"@command\" check output");
+ }
+ elsif ($expected_collprovider eq "icu")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider([\.\d]+)?\n},
+ "\"@command\" check output");
+ }
+
+ for (my $row = 0; $row <= $#commands; $row++)
+ {
+ my ($command_text, $expected) = @{$commands[$row]};
+ ($result, $out_command, $err_command) = psql($command_text, "mydb");
+
+ ok($result, "\"$command_text\" exit code 0");
+ is($err_command, "", "\"$command_text\" no stderr");
+ if ($out_command)
+ {
+ is(
+ $out_command,
+ $expected,
+ "default collation "
+ . $collation
+ . ": \""
+ . $command_text
+ . "\" check output");
+ }
+ }
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+}
+
+my $node = get_new_node('main');
+$node->init;
+$node->start;
+local $ENV{PGPORT} = $node->port;
+
+my @command = ("createuser --createdb --no-superuser non_superuser");
+print "# Running: " . join(" ", @command) . "\n";
+system(@command);
+
+# test createdb
+
+# --locale
+
+test_createdb(
+ "en_US.utf8 locale",
+ "--locale=en_US.utf8",
+ "icu",
+ "");
+
+test_createdb(
+ "be_BY\@latin icu locale",
+ "--locale=be_BY\@latin\@icu",
+ "icu",
+ "");
+
+test_createdb(
+ "be_BY\@latin icu locale invalid modifier order",
+ "--locale=be_BY\@icu\@latin",
+ "",
+ "invalid locale name: \"be_BY\@icu\@latin\"");
+
+# --lc-collate with the same --lc-ctype if needed
+
+test_createdb(
+ "en_US.utf8 lc_collate",
+ "--lc-collate=en_US.utf8 --lc-ctype=en_US.utf8",
+ "icu",
+ "");
+
+test_createdb(
+ "en_US.utf8 lc_collate with C ctype",
+ "--lc-collate=en_US.utf8 --lc-ctype=C",
+ "",
+ "collations with different collate and ctype values are not supported by "
+ . "ICU");
+
+test_createdb(
+ "be_BY\@latin icu lc_collate",
+ "--lc-collate=be_BY\@latin\@icu --lc-ctype=be_BY\@latin",
+ "icu",
+ "");
+
+test_createdb(
+ "be_BY\@latin icu lc_collate invalid modifier order",
+ "--lc-collate=be_BY\@icu\@latin",
+ "",
+ "invalid locale name: \"be_BY\@icu\@latin\" \\(provider \"libc\"\\)");
+
+# test CREATE DATABASE
+
+test_create_database(
+ "en_US.utf8 lc_collate",
+ "LC_COLLATE = 'en_US.utf8' LC_CTYPE = 'en_US.utf8'",
+ "icu",
+ "");
+
+test_create_database(
+ "en_US.utf8 lc_collate with C ctype",
+ "LC_COLLATE = 'en_US.utf8' LC_CTYPE = 'C'",
+ "",
+ "collations with different collate and ctype values are not supported by "
+ . "ICU");
+
+test_create_database(
+ "be_BY\@latin icu lc_collate",
+ "LC_COLLATE = 'be_BY\@latin' LC_CTYPE = 'be_BY\@latin'",
+ "icu",
+ "");
+
+test_create_database(
+ "be_BY\@latin icu lc_collate invalid modifier order",
+ "LC_COLLATE = 'be_BY\@icu\@latin'",
+ "",
+ "invalid locale name: \"be_BY\@icu\@latin\" \\(provider \"libc\"\\)");
+
+# test default collation behaviour
+# use commands and outputs from the regression test collate.icu.utf8
+
+test_default_collation(
+ "--lc-collate=en_US.utf8 --lc-ctype=en_US.utf8 --template=template0",
+ "en_US.utf8\@icu",
+ "icu",
+ (
+ [
+ "CREATE TABLE collate_test1 (a int, b text NOT NULL);",
+ "CREATE TABLE\n"
+ ],
+ [
+ "INSERT INTO collate_test1 VALUES "
+ . "(1, 'abc'), (2, 'äbc'), (3, 'bbc'), (4, 'ABC');",
+ "INSERT 0 4\n"],
+ [
+ "SELECT * FROM collate_test1 WHERE b >= 'bbc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 3 | bbc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, b FROM collate_test1 ORDER BY b;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # star expansion
+ [
+ "SELECT * FROM collate_test1 ORDER BY b;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # upper/lower
+ ["CREATE TABLE collate_test10 (a int, x text);", "CREATE TABLE\n", ""],
+ [
+ "INSERT INTO collate_test10 VALUES (1, 'hij'), (2, 'HIJ');",
+ "INSERT 0 2\n"
+ ],
+ [
+ "SELECT a, lower(x), upper(x), initcap(x) FROM collate_test10;",
+ " a | lower | upper | initcap \n"
+ . "---+-------+-------+---------\n"
+ . " 1 | hij | HIJ | Hij\n"
+ . " 2 | hij | HIJ | Hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ # LIKE/ILIKE
+ [
+ "SELECT * FROM collate_test1 WHERE b LIKE 'abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b LIKE 'abc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b LIKE '%bc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(3 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ILIKE 'abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ILIKE 'abc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ILIKE '%bc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 4 | ABC\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'Türkiye' ILIKE '%KI%' AS \"true\";",
+ " true \n"
+ . "------\n"
+ . " t\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'bıt' ILIKE 'BIT' AS \"false\";",
+ " false \n"
+ . "-------\n"
+ . " f\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ # regular expressions
+ [
+ "SELECT * FROM collate_test1 WHERE b ~ '^abc\$';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~ '^abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~ 'bc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(3 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~* '^abc\$';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~* '^abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~* 'bc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 4 | ABC\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ ["CREATE TABLE collate_test6 (a int, b text);", "CREATE TABLE\n", ""],
+ [
+ "INSERT INTO collate_test6 VALUES "
+ . "(1, 'abc'), (2, 'ABC'), (3, '123'), (4, 'ab1'), "
+ . "(5, 'a1!'), (6, 'a c'), (7, '!.;'), (8, ' '), "
+ . "(9, 'äbç'), (10, 'ÄBÇ');",
+ "INSERT 0 10\n"
+ ],
+ [
+ "SELECT b, "
+ . "b ~ '^[[:alpha:]]+\$' AS is_alpha, "
+ . "b ~ '^[[:upper:]]+\$' AS is_upper, "
+ . "b ~ '^[[:lower:]]+\$' AS is_lower, "
+ . "b ~ '^[[:digit:]]+\$' AS is_digit, "
+ . "b ~ '^[[:alnum:]]+\$' AS is_alnum, "
+ . "b ~ '^[[:graph:]]+\$' AS is_graph, "
+ . "b ~ '^[[:print:]]+\$' AS is_print, "
+ . "b ~ '^[[:punct:]]+\$' AS is_punct, "
+ . "b ~ '^[[:space:]]+\$' AS is_space "
+ . "FROM collate_test6;",
+ " b | is_alpha | is_upper | is_lower | is_digit | is_alnum | is_graph | is_print | is_punct | is_space \n"
+ . "-----+----------+----------+----------+----------+----------+----------+----------+----------+----------\n"
+ . " abc | t | f | t | f | t | t | t | f | f\n"
+ . " ABC | t | t | f | f | t | t | t | f | f\n"
+ . " 123 | f | f | f | t | t | t | t | f | f\n"
+ . " ab1 | f | f | f | f | t | t | t | f | f\n"
+ . " a1! | f | f | f | f | f | t | t | f | f\n"
+ . " a c | f | f | f | f | f | f | t | f | f\n"
+ . " !.; | f | f | f | f | f | t | t | t | f\n"
+ . " | f | f | f | f | f | f | t | f | t\n"
+ . " äbç | t | f | t | f | t | t | t | f | f\n"
+ . " ÄBÇ | t | t | f | f | t | t | t | f | f\n"
+ . "(10 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'Türkiye' ~* 'KI' AS \"true\";",
+ " true \n"
+ . "------\n"
+ . " t\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'bıt' ~* 'BIT' AS \"false\";",
+ " false \n"
+ . "-------\n"
+ . " f\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, lower(coalesce(x, 'foo')) FROM collate_test10;",
+ " a | lower \n"
+ . "---+-------\n"
+ . " 1 | hij\n"
+ . " 2 | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, b, greatest(b, 'CCC') FROM collate_test1 ORDER BY 3;",
+ " a | b | greatest \n"
+ . "---+-----+----------\n"
+ . " 1 | abc | CCC\n"
+ . " 2 | äbc | CCC\n"
+ . " 3 | bbc | CCC\n"
+ . " 4 | ABC | CCC\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, x, lower(greatest(x, 'foo')) FROM collate_test10;",
+ " a | x | lower \n"
+ . "---+-----+-------\n"
+ . " 1 | hij | hij\n"
+ . " 2 | HIJ | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, nullif(b, 'abc') FROM collate_test1 ORDER BY 2;",
+ " a | nullif \n"
+ . "---+--------\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 1 | \n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, lower(nullif(x, 'foo')) FROM collate_test10;",
+ " a | lower \n"
+ . "---+-------\n"
+ . " 1 | hij\n"
+ . " 2 | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, CASE b WHEN 'abc' THEN 'abcd' ELSE b END "
+ . "FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+------\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 1 | abcd\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ ["CREATE DOMAIN testdomain AS text;", "CREATE DOMAIN\n", ""],
+ [
+ "SELECT a, b::testdomain FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, lower(x::testdomain) FROM collate_test10;",
+ " a | lower \n"
+ . "---+-------\n"
+ . " 1 | hij\n"
+ . " 2 | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT min(b), max(b) FROM collate_test1;",
+ " min | max \n"
+ . "-----+-----\n"
+ . " abc | bbc\n"
+ . "(1 row)\n"
+ . "\n",
+ ""
+ ],
+ [
+ "SELECT array_agg(b ORDER BY b) FROM collate_test1;",
+ " array_agg \n"
+ . "-------------------\n"
+ . " {abc,ABC,äbc,bbc}\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, b FROM collate_test1 "
+ . "UNION ALL "
+ . "SELECT a, b FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 3 | bbc\n"
+ . "(8 rows)\n"
+ . "\n"
+ ],
+ # casting
+ [
+ "SELECT a, CAST(b AS varchar) FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # propagation of collation in SQL functions (inlined and non-inlined
+ # cases) and plpgsql functions too
+ [
+ "CREATE FUNCTION mylt (text, text) RETURNS boolean LANGUAGE sql "
+ . "AS \$\$ select \$1 < \$2 \$\$;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "CREATE FUNCTION mylt_noninline (text, text) "
+ . "RETURNS boolean LANGUAGE sql "
+ . "AS \$\$ select \$1 < \$2 limit 1 \$\$;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "CREATE FUNCTION mylt_plpgsql (text, text) "
+ . "RETURNS boolean LANGUAGE plpgsql "
+ . "AS \$\$ begin return \$1 < \$2; end \$\$;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "SELECT a.b AS a, b.b AS b, a.b < b.b AS lt, "
+ . "mylt(a.b, b.b), mylt_noninline(a.b, b.b), mylt_plpgsql(a.b, b.b) "
+ . "FROM collate_test1 a, collate_test1 b "
+ . "ORDER BY a.b, b.b;",
+ " a | b | lt | mylt | mylt_noninline | mylt_plpgsql \n"
+ . "-----+-----+----+------+----------------+--------------\n"
+ . " abc | abc | f | f | f | f\n"
+ . " abc | ABC | t | t | t | t\n"
+ . " abc | äbc | t | t | t | t\n"
+ . " abc | bbc | t | t | t | t\n"
+ . " ABC | abc | f | f | f | f\n"
+ . " ABC | ABC | f | f | f | f\n"
+ . " ABC | äbc | t | t | t | t\n"
+ . " ABC | bbc | t | t | t | t\n"
+ . " äbc | abc | f | f | f | f\n"
+ . " äbc | ABC | f | f | f | f\n"
+ . " äbc | äbc | f | f | f | f\n"
+ . " äbc | bbc | t | t | t | t\n"
+ . " bbc | abc | f | f | f | f\n"
+ . " bbc | ABC | f | f | f | f\n"
+ . " bbc | äbc | f | f | f | f\n"
+ . " bbc | bbc | f | f | f | f\n"
+ . "(16 rows)\n"
+ . "\n"
+ ],
+ # polymorphism
+ [
+ "SELECT * FROM unnest("
+ . "(SELECT array_agg(b ORDER BY b) FROM collate_test1)"
+ . ") ORDER BY 1;",
+ " unnest \n"
+ . "--------\n"
+ . " abc\n"
+ . " ABC\n"
+ . " äbc\n"
+ . " bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "CREATE FUNCTION dup (anyelement) RETURNS anyelement "
+ . "AS 'select \$1' LANGUAGE sql;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "SELECT a, dup(b) FROM collate_test1 ORDER BY 2;",
+ " a | dup \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # indexes
+ [
+ "CREATE INDEX collate_test1_idx1 ON collate_test1 (b);",
+ "CREATE INDEX\n"
+ ]
+ )
+);
+
+$node->stop;
diff --git a/src/test/default_collation/icu/.gitignore b/src/test/default_collation/icu/.gitignore
new file mode 100644
index 0000000..871e943
--- /dev/null
+++ b/src/test/default_collation/icu/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/default_collation/icu/Makefile b/src/test/default_collation/icu/Makefile
new file mode 100644
index 0000000..5ee91d8
--- /dev/null
+++ b/src/test/default_collation/icu/Makefile
@@ -0,0 +1,11 @@
+# src/test/default_collation/icu/Makefile
+
+subdir = src/test/default_collation/icu
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+ $(prove_check)
+
+clean distclean maintainer-clean:
+ rm -rf tmp_check
diff --git a/src/test/default_collation/icu/t/001_default_collation.pl b/src/test/default_collation/icu/t/001_default_collation.pl
new file mode 100644
index 0000000..8b58be3
--- /dev/null
+++ b/src/test/default_collation/icu/t/001_default_collation.pl
@@ -0,0 +1,605 @@
+use strict;
+use warnings;
+
+use Config;
+use PostgresNode;
+use TestLib;
+use Test::More;
+
+my $tempdir = TestLib::tempdir;
+my $datadir = "$tempdir/data";
+
+# check whether ICU can convert C locale to a language tag
+
+my ($in_initdb, $out_initdb, $err_initdb);
+my @command = (qw(initdb -A trust -N -D), $datadir, "--locale=C\@icu");
+print "# Running: " . join(" ", @command) . "\n";
+my $result = IPC::Run::run \@command, \$in_initdb, \$out_initdb, \$err_initdb;
+
+my $c_to_icu_language_tag = (
+ not $err_initdb =~ /ICU error: could not convert locale name "C" to language tag: U_ILLEGAL_ARGUMENT_ERROR/);
+
+# get the number of tests
+
+plan tests => $c_to_icu_language_tag ? 124 : 110;
+
+# test initdb
+
+sub test_initdb
+{
+ my ($test_name, $options, $expected_collprovider, $error_message) = @_;
+ my ($in_initdb, $out_initdb, $err_initdb);
+
+ mkdir $datadir;
+
+ my @command = (qw(initdb -A trust -N -D), $datadir, split(" ", $options));
+ print "# Running: " . join(" ", @command) . "\n";
+ my $result = IPC::Run::run \@command, \$in_initdb, \$out_initdb,
+ \$err_initdb;
+
+ if ($error_message)
+ {
+ like($err_initdb,
+ qr{$error_message},
+ "initdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_initdb, "", "\"@command\" no stderr");
+ like($out_initdb,
+ qr{The default collation provider is \"$expected_collprovider\"\.},
+ "initdb: $test_name: check output");
+ }
+
+ File::Path::rmtree $datadir;
+}
+
+# --locale
+
+test_initdb(
+ "empty libc locale",
+ "--locale=\@libc",
+ "libc",
+ "");
+
+test_initdb(
+ "C locale without collation provider",
+ "--locale=C",
+ "libc",
+ "");
+
+test_initdb(
+ "POSIX locale without collation provider",
+ "--locale=POSIX",
+ "libc",
+ "");
+
+test_initdb(
+ "C libc locale",
+ "--locale=C\@libc",
+ "libc",
+ "");
+
+test_initdb(
+ "POSIX libc locale",
+ "--locale=POSIX\@libc",
+ "libc",
+ "");
+
+test_initdb(
+ "C icu locale",
+ "--locale=C\@icu",
+ "",
+ ($c_to_icu_language_tag ?
+ "selected encoding \\(SQL_ASCII\\) is not supported for ICU locales" :
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR"));
+
+test_initdb(
+ "POSIX icu locale",
+ "--locale=POSIX\@icu",
+ "",
+ ($c_to_icu_language_tag ?
+ "selected encoding \\(SQL_ASCII\\) is not supported for ICU locales" :
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR"));
+
+test_initdb(
+ "C locale too many modifiers",
+ "--locale=C\@icu\@libc",
+ "",
+ "invalid locale name \"C\@icu\"");
+
+test_initdb(
+ "ICU language tag format locale",
+ "--locale=und-x-icu",
+ "",
+ "invalid locale name \"und-x-icu\"");
+
+# --lc-collate with the same --lc-ctype if needed
+
+test_initdb(
+ "empty libc lc_collate",
+ "--lc-collate=\@libc",
+ "libc",
+ "");
+
+test_initdb(
+ "C lc_collate without collation provider",
+ "--lc-collate=C",
+ "libc",
+ "");
+
+test_initdb(
+ "POSIX lc_collate without collation provider",
+ "--lc-collate=POSIX",
+ "libc",
+ "");
+
+test_initdb(
+ "C libc lc_collate",
+ "--lc-collate=C\@libc",
+ "libc",
+ "");
+
+test_initdb(
+ "POSIX libc lc_collate",
+ "--lc-collate=POSIX\@libc",
+ "libc",
+ "");
+
+test_initdb(
+ "C icu lc_collate",
+ "--lc-collate=C\@icu --lc-ctype=C",
+ "",
+ ($c_to_icu_language_tag ?
+ "selected encoding \\(SQL_ASCII\\) is not supported for ICU locales" :
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR"));
+
+test_initdb(
+ "POSIX icu lc_collate",
+ "--lc-collate=POSIX\@icu --lc-ctype=POSIX",
+ "",
+ ($c_to_icu_language_tag ?
+ "selected encoding \\(SQL_ASCII\\) is not supported for ICU locales" :
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR"));
+
+test_initdb(
+ "C lc_collate too many modifiers",
+ "--lc-collate=C\@icu\@libc",
+ "",
+ "invalid locale name \"C\@icu\"");
+
+test_initdb(
+ "ICU language tag format lc_collate",
+ "--lc-collate=und-x-icu",
+ "",
+ "invalid locale name \"und-x-icu\"");
+
+# --locale & --lc-collate
+
+test_initdb(
+ "lc_collate implicit provider takes precedence",
+ "--locale=\@icu --lc-collate=C",
+ "libc",
+ "");
+
+test_initdb(
+ "lc_collate explicit provider takes precedence",
+ "--locale=C\@libc --lc-collate=C\@icu",
+ "",
+ ($c_to_icu_language_tag ?
+ "selected encoding \\(SQL_ASCII\\) is not supported for ICU locales" :
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR"));
+
+# test createdb and CREATE DATABASE
+
+sub test_createdb
+{
+ my ($test_name, $options, $expected_collprovider, $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ @command = ("createdb",
+ split(" ", $options),
+ "--template=template0",
+ "mydb");
+
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "createdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+
+ @command = (
+ "psql",
+ "-c",
+ "select datcollate from pg_database where datname = 'mydb';");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($expected_collprovider eq "libc")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider\n},
+ "createdb: $test_name: check pg_database.datcollate");
+ }
+ elsif ($expected_collprovider eq "icu")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider([\.\d]+)?\n},
+ "createdb: $test_name: check pg_database.datcollate");
+ }
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+sub test_create_database
+{
+ my ($test_name,
+ $createdb_options,
+ $psql_options,
+ $expected_collprovider,
+ $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ @command = ("psql",
+ split(" ", $psql_options),
+ "-c",
+ "create database mydb "
+ . $createdb_options
+ . " template = template0;");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "CREATE DATABASE: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+ like($out_command, qr{CREATE DATABASE}, "\"@command\" check output");
+
+ @command = (
+ "psql",
+ "-c",
+ "select datcollate from pg_database where datname = 'mydb';");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($expected_collprovider eq "libc")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider\n},
+ "CREATE DATABASE: $test_name: check pg_database.datcollate");
+ }
+ elsif ($expected_collprovider eq "icu")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider([\.\d]+)?\n},
+ "CREATE DATABASE: $test_name: check pg_database.datcollate");
+ }
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+my $node = get_new_node('main');
+$node->init;
+$node->start;
+local $ENV{PGPORT} = $node->port;
+
+@command = ("createuser --createdb --no-superuser non_superuser");
+print "# Running: " . join(" ", @command) . "\n";
+system(@command);
+
+# test createdb
+
+# --locale
+
+test_createdb(
+ "empty libc locale",
+ "--locale=\@libc",
+ "libc",
+ "");
+
+test_createdb(
+ "C locale without collation provider",
+ "--locale=C",
+ "libc",
+ "");
+
+test_createdb(
+ "POSIX locale without collation provider",
+ "--locale=POSIX",
+ "libc",
+ "");
+
+test_createdb(
+ "C libc locale",
+ "--locale=C\@libc",
+ "libc",
+ "");
+
+test_createdb(
+ "POSIX libc locale",
+ "--locale=POSIX\@libc",
+ "libc",
+ "");
+
+if ($c_to_icu_language_tag)
+{
+ test_createdb(
+ "C icu locale with SQL_ASCII encoding and superuser",
+ "--locale=C\@icu --encoding=SQL_ASCII",
+ "icu",
+ "");
+}
+else
+{
+ test_createdb(
+ "C icu locale with SQL_ASCII encoding and superuser",
+ "--locale=C\@icu --encoding=SQL_ASCII",
+ "",
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR");
+}
+
+test_createdb(
+ "C icu locale with SQL_ASCII encoding and non-superuser",
+ "--locale=C\@icu --encoding=SQL_ASCII --username=non_superuser",
+ "",
+ "encoding \"SQL_ASCII\" is not supported for ICU locales");
+
+if ($c_to_icu_language_tag)
+{
+ test_createdb(
+ "POSIX icu locale with SQL_ASCII encoding and superuser",
+ "--locale=POSIX\@icu --encoding=SQL_ASCII",
+ "icu",
+ "");
+}
+else
+{
+ test_createdb(
+ "POSIX icu locale with SQL_ASCII encoding and superuser",
+ "--locale=POSIX\@icu --encoding=SQL_ASCII",
+ "",
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR");
+}
+
+test_createdb(
+ "POSIX icu locale with SQL_ASCII encoding and non-superuser",
+ "--locale=POSIX\@icu --encoding=SQL_ASCII --username=non_superuser",
+ "",
+ "encoding \"SQL_ASCII\" is not supported for ICU locales");
+
+test_createdb(
+ "C locale too many modifiers",
+ "--locale=C\@icu\@libc",
+ "",
+ "invalid locale name: \"C\@icu\"");
+
+test_createdb(
+ "ICU language tag format locale",
+ "--locale=und-x-icu",
+ "",
+ "invalid locale name: \"und-x-icu\"");
+
+# --lc-collate with the same --lc-ctype if needed
+
+test_createdb(
+ "empty libc lc_collate",
+ "--lc-collate=\@libc",
+ "libc",
+ "");
+
+test_createdb(
+ "C lc_collate without collation provider",
+ "--lc-collate=C --lc-ctype=C",
+ "libc",
+ "");
+
+test_createdb(
+ "POSIX lc_collate without collation provider",
+ "--lc-collate=POSIX --lc-ctype=POSIX",
+ "libc",
+ "");
+
+test_createdb(
+ "C libc lc_collate",
+ "--lc-collate=C\@libc --lc-ctype=C",
+ "libc",
+ "");
+
+test_createdb(
+ "POSIX libc lc_collate",
+ "--lc-collate=POSIX\@libc --lc-ctype=POSIX",
+ "libc",
+ "");
+
+if ($c_to_icu_language_tag)
+{
+ test_createdb(
+ "C icu lc_collate with SQL_ASCII encoding and superuser",
+ "--lc-collate=C\@icu --lc-ctype=C --encoding=SQL_ASCII",
+ "icu",
+ "");
+}
+else
+{
+ test_createdb(
+ "C icu lc_collate with SQL_ASCII encoding and superuser",
+ "--lc-collate=C\@icu --lc-ctype=C --encoding=SQL_ASCII",
+ "",
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR");
+}
+
+test_createdb(
+ "C icu lc_collate with SQL_ASCII encoding and non-superuser",
+ "--lc-collate=C\@icu --lc-ctype=C --encoding=SQL_ASCII "
+ . "--username=non_superuser",
+ "",
+ "encoding \"SQL_ASCII\" is not supported for ICU locales");
+
+if ($c_to_icu_language_tag)
+{
+ test_createdb(
+ "POSIX icu lc_collate with SQL_ASCII encoding and superuser",
+ "--lc-collate=POSIX\@icu --lc-ctype=POSIX --encoding=SQL_ASCII",
+ "icu",
+ "");
+
+}
+else
+{
+ test_createdb(
+ "POSIX icu lc_collate with SQL_ASCII encoding and superuser",
+ "--lc-collate=POSIX\@icu --lc-ctype=POSIX --encoding=SQL_ASCII",
+ "",
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR");
+}
+
+test_createdb(
+ "POSIX icu lc_collate with SQL_ASCII encoding and non-superuser",
+ "--lc-collate=POSIX\@icu --lc-ctype=POSIX --encoding=SQL_ASCII "
+ . "--username=non_superuser",
+ "",
+ "encoding \"SQL_ASCII\" is not supported for ICU locales");
+
+test_createdb(
+ "C lc_collate too many modifiers",
+ "--lc-collate=C\@icu\@libc",
+ "",
+ "invalid locale name: \"C\@icu\"");
+
+test_createdb(
+ "ICU language tag format lc_collate",
+ "--lc-collate=und-x-icu",
+ "",
+ "invalid locale name: \"und-x-icu\"");
+
+# test CREATE DATABASE
+
+# LC_COLLATE with the same LC_CTYPE if needed
+
+test_create_database(
+ "empty libc lc_collate",
+ "LC_COLLATE = '\@libc'",
+ "",
+ "libc",
+ "");
+
+test_create_database(
+ "C lc_collate without collation provider",
+ "LC_COLLATE = 'C' LC_CTYPE = 'C'",
+ "",
+ "libc",
+ "");
+
+test_create_database(
+ "POSIX lc_collate without collation provider",
+ "LC_COLLATE = 'POSIX' LC_CTYPE = 'POSIX'",
+ "",
+ "libc",
+ "");
+
+test_create_database(
+ "C libc lc_collate",
+ "LC_COLLATE = 'C\@libc' LC_CTYPE = 'C'",
+ "",
+ "libc",
+ "");
+
+test_create_database(
+ "POSIX libc lc_collate",
+ "LC_COLLATE = 'POSIX\@libc' LC_CTYPE = 'POSIX'",
+ "",
+ "libc",
+ "");
+
+if ($c_to_icu_language_tag)
+{
+ test_create_database(
+ "C icu lc_collate with SQL_ASCII encoding and superuser",
+ "LC_COLLATE = 'C\@icu' LC_CTYPE = 'C' ENCODING = 'SQL_ASCII'",
+ "",
+ "icu",
+ "");
+}
+else
+{
+ test_create_database(
+ "C icu lc_collate with SQL_ASCII encoding and superuser",
+ "LC_COLLATE = 'C\@icu' LC_CTYPE = 'C' ENCODING = 'SQL_ASCII'",
+ "",
+ "",
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR");
+}
+
+test_create_database(
+ "C icu lc_collate with SQL_ASCII encoding and non-superuser",
+ "LC_COLLATE = 'C\@icu' LC_CTYPE = 'C' ENCODING = 'SQL_ASCII'",
+ "--username=non_superuser",
+ "",
+ "encoding \"SQL_ASCII\" is not supported for ICU locales");
+
+if ($c_to_icu_language_tag)
+{
+ test_create_database(
+ "POSIX icu lc_collate with SQL_ASCII encoding and superuser",
+ "LC_COLLATE = 'POSIX\@icu' LC_CTYPE = 'POSIX' ENCODING = 'SQL_ASCII'",
+ "",
+ "icu",
+ "");
+}
+else
+{
+ test_create_database(
+ "POSIX icu lc_collate with SQL_ASCII encoding and superuser",
+ "LC_COLLATE = 'POSIX\@icu' LC_CTYPE = 'POSIX' ENCODING = 'SQL_ASCII'",
+ "",
+ "",
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR");
+}
+
+test_create_database(
+ "POSIX icu lc_collate with SQL_ASCII encoding and non-superuser",
+ "LC_COLLATE = 'POSIX\@icu' LC_CTYPE = 'POSIX' ENCODING = 'SQL_ASCII'",
+ "--username=non_superuser",
+ "",
+ "encoding \"SQL_ASCII\" is not supported for ICU locales");
+
+test_create_database(
+ "C lc_collate too many modifiers",
+ "LC_COLLATE = 'C\@icu\@libc'",
+ "",
+ "",
+ "invalid locale name: \"C\@icu\"");
+
+test_create_database(
+ "ICU language tag format lc_collate",
+ "LC_COLLATE = 'und-x-icu'",
+ "",
+ "",
+ "invalid locale name: \"und-x-icu\"");
+
+$node->stop;
diff --git a/src/test/default_collation/libc.utf8/.gitignore b/src/test/default_collation/libc.utf8/.gitignore
new file mode 100644
index 0000000..871e943
--- /dev/null
+++ b/src/test/default_collation/libc.utf8/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/default_collation/libc.utf8/Makefile b/src/test/default_collation/libc.utf8/Makefile
new file mode 100644
index 0000000..e5b9d20
--- /dev/null
+++ b/src/test/default_collation/libc.utf8/Makefile
@@ -0,0 +1,11 @@
+# src/test/default_collation/libc.utf8/Makefile
+
+subdir = src/test/default_collation/libc.utf8
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+ $(prove_check)
+
+clean distclean maintainer-clean:
+ rm -rf tmp_check
diff --git a/src/test/default_collation/libc.utf8/t/001_default_collation.pl b/src/test/default_collation/libc.utf8/t/001_default_collation.pl
new file mode 100644
index 0000000..e4b3552
--- /dev/null
+++ b/src/test/default_collation/libc.utf8/t/001_default_collation.pl
@@ -0,0 +1,703 @@
+use strict;
+use warnings;
+
+use Config;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 168;
+
+my $tempdir = TestLib::tempdir;
+my $datadir = "$tempdir/data";
+
+# test initdb
+
+sub test_initdb
+{
+ my ($test_name, $options, $error_message) = @_;
+ my ($in_initdb, $out_initdb, $err_initdb);
+
+ mkdir $datadir;
+
+ my @command = (qw(initdb -A trust -N -D), $datadir, split(" ", $options));
+ print "# Running: " . join(" ", @command) . "\n";
+ my $result = IPC::Run::run \@command, \$in_initdb, \$out_initdb,
+ \$err_initdb;
+
+ if ($error_message)
+ {
+ like($err_initdb,
+ qr{$error_message},
+ "initdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_initdb, "", "\"@command\" no stderr");
+ like($out_initdb,
+ qr{The default collation provider is \"libc\"\.},
+ "initdb: $test_name: check output");
+ }
+
+ File::Path::rmtree $datadir;
+}
+
+sub psql
+{
+ my ($command, $db) = @_;
+ my ($result, $in, $out, $err);
+ my @psql = ('psql', '-X', '-c', $command);
+ if (defined($db))
+ {
+ push(@psql, $db);
+ }
+ print "# Running: " . join(" ", @psql) . "\n";
+ $result = IPC::Run::run \@psql, \$in, \$out, \$err;
+ ($result, $out, $err);
+}
+
+# --locale
+
+test_initdb(
+ "be_BY\@latin libc locale",
+ "--locale=be_BY\@latin\@libc",
+ "");
+
+test_initdb(
+ "be_BY\@latin libc locale invalid modifier order",
+ "--locale=be_BY\@libc\@latin",
+ "invalid locale name \"be_BY\@libc\@latin\"");
+
+# --lc-collate
+
+test_initdb(
+ "be_BY\@latin libc lc_collate",
+ "--lc-collate=be_BY\@latin\@libc",
+ "");
+
+test_initdb(
+ "be_BY\@latin libc lc_collate invalid modifier order",
+ "--lc-collate=be_BY\@libc\@latin",
+ "invalid locale name \"be_BY\@libc\@latin\" \\(provider \"libc\"\\)");
+
+# test createdb, CREATE DATABASE and default collation behaviour
+
+sub test_createdb
+{
+ my ($test_name, $options, $from_template0, $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ if ($from_template0)
+ {
+ $options = $options . " --template=template0";
+ }
+
+ @command = ("createdb", split(" ", $options), "mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "createdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+
+ ($result, $out_command, $err_command) = psql(
+ "select datcollate from pg_database where datname = 'mydb';");
+
+ like($out_command,
+ qr{\@libc\n},
+ "createdb: $test_name: check pg_database.datcollate");
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+sub test_create_database
+{
+ my ($test_name, $options, $from_template0, $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ ($result, $out_command, $err_command) = psql(
+ "create database mydb "
+ . $options
+ . ($from_template0 ? " TEMPLATE = template0;" : ";"));
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "CREATE DATABASE: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+ like($out_command, qr{CREATE DATABASE}, "\"@command\" check output");
+
+ ($result, $out_command, $err_command) = psql(
+ "select datcollate from pg_database where datname = 'mydb';");
+
+ like($out_command,
+ qr{\@libc\n},
+ "CREATE DATABASE: $test_name: check pg_database.datcollate");
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+sub test_default_collation
+{
+ my ($createdb_options, $collation, @commands) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ @command = ("createdb", split(" ", $createdb_options), "mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+
+ ($result, $out_command, $err_command) = psql(
+ "select datcollate from pg_database where datname = 'mydb';");
+
+ like($out_command, qr{\@libc\n}, "\"@command\" check output");
+
+ for (my $row = 0; $row <= $#commands; $row++)
+ {
+ my ($command_text, $expected) = @{$commands[$row]};
+ ($result, $out_command, $err_command) = psql($command_text, "mydb");
+
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+ if ($out_command)
+ {
+ is(
+ $out_command,
+ $expected,
+ "default collation "
+ . $collation
+ . ": \""
+ . $command_text
+ . "\" check output");
+ }
+ }
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+}
+
+my $node = get_new_node('main');
+$node->init;
+$node->start;
+local $ENV{PGPORT} = $node->port;
+
+# test createdb
+
+# --locale
+
+test_createdb(
+ "be_BY\@latin libc locale",
+ "--locale=be_BY\@latin\@libc",
+ 1,
+ "");
+
+test_createdb(
+ "be_BY\@latin libc locale invalid modifier order",
+ "--locale=be_BY\@libc\@latin",
+ 1,
+ "invalid locale name: \"be_BY\@libc\@latin\"");
+
+# --lc-collate
+
+test_createdb(
+ "be_BY\@latin libc lc_collate",
+ "--lc-collate=be_BY\@latin\@libc",
+ 1,
+ "");
+
+test_createdb(
+ "be_BY\@latin libc lc_collate invalid modifier order",
+ "--lc-collate=be_BY\@libc\@latin",
+ 1,
+ "invalid locale name: \"be_BY\@libc\@latin\" \\(provider \"libc\"\\)");
+
+# test CREATE DATABASE
+
+# LC_COLLATE
+
+test_create_database(
+ "be_BY\@latin libc lc_collate",
+ "LC_COLLATE = 'be_BY\@latin\@libc'",
+ 1,
+ "");
+
+test_create_database(
+ "be_BY\@latin libc lc_collate invalid modifier order",
+ "LC_COLLATE = 'be_BY\@libc\@latin'",
+ 1,
+ "invalid locale name: \"be_BY\@libc\@latin\" \\(provider \"libc\"\\)");
+
+# test default collation behaviour
+# use commands and outputs from the regression test collate.linux.utf8
+
+test_default_collation(
+ "--lc-collate=en_US.utf8\@libc --template=template0",
+ "en_US.utf8\@libc",
+ (
+ [
+ "CREATE TABLE collate_test1 (a int, b text NOT NULL);",
+ "CREATE TABLE\n"
+ ],
+ [
+ "INSERT INTO collate_test1 VALUES "
+ . "(1, 'abc'), (2, 'äbc'), (3, 'bbc'), (4, 'ABC');",
+ "INSERT 0 4\n"],
+ [
+ "SELECT * FROM collate_test1 WHERE b >= 'bbc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 3 | bbc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, b FROM collate_test1 ORDER BY b;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # star expansion
+ [
+ "SELECT * FROM collate_test1 ORDER BY b;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # upper/lower
+ ["CREATE TABLE collate_test10 (a int, x text);", "CREATE TABLE\n", ""],
+ [
+ "INSERT INTO collate_test10 VALUES (1, 'hij'), (2, 'HIJ');",
+ "INSERT 0 2\n"
+ ],
+ [
+ "SELECT a, lower(x), upper(x), initcap(x) FROM collate_test10;",
+ " a | lower | upper | initcap \n"
+ . "---+-------+-------+---------\n"
+ . " 1 | hij | HIJ | Hij\n"
+ . " 2 | hij | HIJ | Hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ # LIKE/ILIKE
+ [
+ "SELECT * FROM collate_test1 WHERE b LIKE 'abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b LIKE 'abc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b LIKE '%bc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(3 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ILIKE 'abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ILIKE 'abc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ILIKE '%bc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 4 | ABC\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'Türkiye' ILIKE '%KI%' AS \"true\";",
+ " true \n"
+ . "------\n"
+ . " t\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'bıt' ILIKE 'BIT' AS \"false\";",
+ " false \n"
+ . "-------\n"
+ . " f\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ # regular expressions
+ [
+ "SELECT * FROM collate_test1 WHERE b ~ '^abc\$';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~ '^abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~ 'bc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(3 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~* '^abc\$';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~* '^abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~* 'bc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 4 | ABC\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ ["CREATE TABLE collate_test6 (a int, b text);", "CREATE TABLE\n", ""],
+ [
+ "INSERT INTO collate_test6 VALUES "
+ . "(1, 'abc'), (2, 'ABC'), (3, '123'), (4, 'ab1'), "
+ . "(5, 'a1!'), (6, 'a c'), (7, '!.;'), (8, ' '), "
+ . "(9, 'äbç'), (10, 'ÄBÇ');",
+ "INSERT 0 10\n"
+ ],
+ [
+ "SELECT b, "
+ . "b ~ '^[[:alpha:]]+\$' AS is_alpha, "
+ . "b ~ '^[[:upper:]]+\$' AS is_upper, "
+ . "b ~ '^[[:lower:]]+\$' AS is_lower, "
+ . "b ~ '^[[:digit:]]+\$' AS is_digit, "
+ . "b ~ '^[[:alnum:]]+\$' AS is_alnum, "
+ . "b ~ '^[[:graph:]]+\$' AS is_graph, "
+ . "b ~ '^[[:print:]]+\$' AS is_print, "
+ . "b ~ '^[[:punct:]]+\$' AS is_punct, "
+ . "b ~ '^[[:space:]]+\$' AS is_space "
+ . "FROM collate_test6;",
+ " b | is_alpha | is_upper | is_lower | is_digit | is_alnum | is_graph | is_print | is_punct | is_space \n"
+ . "-----+----------+----------+----------+----------+----------+----------+----------+----------+----------\n"
+ . " abc | t | f | t | f | t | t | t | f | f\n"
+ . " ABC | t | t | f | f | t | t | t | f | f\n"
+ . " 123 | f | f | f | t | t | t | t | f | f\n"
+ . " ab1 | f | f | f | f | t | t | t | f | f\n"
+ . " a1! | f | f | f | f | f | t | t | f | f\n"
+ . " a c | f | f | f | f | f | f | t | f | f\n"
+ . " !.; | f | f | f | f | f | t | t | t | f\n"
+ . " | f | f | f | f | f | f | t | f | t\n"
+ . " äbç | t | f | t | f | t | t | t | f | f\n"
+ . " ÄBÇ | t | t | f | f | t | t | t | f | f\n"
+ . "(10 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'Türkiye' ~* 'KI' AS \"true\";",
+ " true \n"
+ . "------\n"
+ . " t\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'bıt' ~* 'BIT' AS \"false\";",
+ " false \n"
+ . "-------\n"
+ . " f\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, lower(coalesce(x, 'foo')) FROM collate_test10;",
+ " a | lower \n"
+ . "---+-------\n"
+ . " 1 | hij\n"
+ . " 2 | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, b, greatest(b, 'CCC') FROM collate_test1 ORDER BY 3;",
+ " a | b | greatest \n"
+ . "---+-----+----------\n"
+ . " 1 | abc | CCC\n"
+ . " 2 | äbc | CCC\n"
+ . " 3 | bbc | CCC\n"
+ . " 4 | ABC | CCC\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, x, lower(greatest(x, 'foo')) FROM collate_test10;",
+ " a | x | lower \n"
+ . "---+-----+-------\n"
+ . " 1 | hij | hij\n"
+ . " 2 | HIJ | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, nullif(b, 'abc') FROM collate_test1 ORDER BY 2;",
+ " a | nullif \n"
+ . "---+--------\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 1 | \n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, lower(nullif(x, 'foo')) FROM collate_test10;",
+ " a | lower \n"
+ . "---+-------\n"
+ . " 1 | hij\n"
+ . " 2 | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, CASE b WHEN 'abc' THEN 'abcd' ELSE b END "
+ . "FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+------\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 1 | abcd\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ ["CREATE DOMAIN testdomain AS text;", "CREATE DOMAIN\n", ""],
+ [
+ "SELECT a, b::testdomain FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, lower(x::testdomain) FROM collate_test10;",
+ " a | lower \n"
+ . "---+-------\n"
+ . " 1 | hij\n"
+ . " 2 | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT min(b), max(b) FROM collate_test1;",
+ " min | max \n"
+ . "-----+-----\n"
+ . " abc | bbc\n"
+ . "(1 row)\n"
+ . "\n",
+ ""
+ ],
+ [
+ "SELECT array_agg(b ORDER BY b) FROM collate_test1;",
+ " array_agg \n"
+ . "-------------------\n"
+ . " {abc,ABC,äbc,bbc}\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, b FROM collate_test1 "
+ . "UNION ALL "
+ . "SELECT a, b FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 3 | bbc\n"
+ . "(8 rows)\n"
+ . "\n"
+ ],
+ # casting
+ [
+ "SELECT a, CAST(b AS varchar) FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # propagation of collation in SQL functions (inlined and non-inlined
+ # cases) and plpgsql functions too
+ [
+ "CREATE FUNCTION mylt (text, text) RETURNS boolean LANGUAGE sql "
+ . "AS \$\$ select \$1 < \$2 \$\$;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "CREATE FUNCTION mylt_noninline (text, text) "
+ . "RETURNS boolean LANGUAGE sql "
+ . "AS \$\$ select \$1 < \$2 limit 1 \$\$;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "CREATE FUNCTION mylt_plpgsql (text, text) "
+ . "RETURNS boolean LANGUAGE plpgsql "
+ . "AS \$\$ begin return \$1 < \$2; end \$\$;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "SELECT a.b AS a, b.b AS b, a.b < b.b AS lt, "
+ . "mylt(a.b, b.b), mylt_noninline(a.b, b.b), mylt_plpgsql(a.b, b.b) "
+ . "FROM collate_test1 a, collate_test1 b "
+ . "ORDER BY a.b, b.b;",
+ " a | b | lt | mylt | mylt_noninline | mylt_plpgsql \n"
+ . "-----+-----+----+------+----------------+--------------\n"
+ . " abc | abc | f | f | f | f\n"
+ . " abc | ABC | t | t | t | t\n"
+ . " abc | äbc | t | t | t | t\n"
+ . " abc | bbc | t | t | t | t\n"
+ . " ABC | abc | f | f | f | f\n"
+ . " ABC | ABC | f | f | f | f\n"
+ . " ABC | äbc | t | t | t | t\n"
+ . " ABC | bbc | t | t | t | t\n"
+ . " äbc | abc | f | f | f | f\n"
+ . " äbc | ABC | f | f | f | f\n"
+ . " äbc | äbc | f | f | f | f\n"
+ . " äbc | bbc | t | t | t | t\n"
+ . " bbc | abc | f | f | f | f\n"
+ . " bbc | ABC | f | f | f | f\n"
+ . " bbc | äbc | f | f | f | f\n"
+ . " bbc | bbc | f | f | f | f\n"
+ . "(16 rows)\n"
+ . "\n"
+ ],
+ # polymorphism
+ [
+ "SELECT * FROM unnest("
+ . "(SELECT array_agg(b ORDER BY b) FROM collate_test1)"
+ . ") ORDER BY 1;",
+ " unnest \n"
+ . "--------\n"
+ . " abc\n"
+ . " ABC\n"
+ . " äbc\n"
+ . " bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "CREATE FUNCTION dup (anyelement) RETURNS anyelement "
+ . "AS 'select \$1' LANGUAGE sql;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "SELECT a, dup(b) FROM collate_test1 ORDER BY 2;",
+ " a | dup \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # indexes
+ [
+ "CREATE INDEX collate_test1_idx1 ON collate_test1 (b);",
+ "CREATE INDEX\n"
+ ]
+ )
+);
+
+$node->stop;
diff --git a/src/test/default_collation/libc/.gitignore b/src/test/default_collation/libc/.gitignore
new file mode 100644
index 0000000..871e943
--- /dev/null
+++ b/src/test/default_collation/libc/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/default_collation/libc/Makefile b/src/test/default_collation/libc/Makefile
new file mode 100644
index 0000000..98ab736
--- /dev/null
+++ b/src/test/default_collation/libc/Makefile
@@ -0,0 +1,11 @@
+# src/test/default_collation/libc/Makefile
+
+subdir = src/test/default_collation/libc
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+ $(prove_check)
+
+clean distclean maintainer-clean:
+ rm -rf tmp_check
diff --git a/src/test/default_collation/libc/t/001_default_collation.pl b/src/test/default_collation/libc/t/001_default_collation.pl
new file mode 100644
index 0000000..bc8a6ad
--- /dev/null
+++ b/src/test/default_collation/libc/t/001_default_collation.pl
@@ -0,0 +1,355 @@
+use strict;
+use warnings;
+
+use Config;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 90;
+
+my $tempdir = TestLib::tempdir;
+my $datadir = "$tempdir/data";
+
+# test initdb
+
+sub test_initdb
+{
+ my ($test_name, $options, $error_message) = @_;
+ my ($in_initdb, $out_initdb, $err_initdb);
+
+ mkdir $datadir;
+
+ my @command = (qw(initdb -A trust -N -D), $datadir, split(" ", $options));
+ print "# Running: " . join(" ", @command) . "\n";
+ my $result = IPC::Run::run \@command, \$in_initdb, \$out_initdb,
+ \$err_initdb;
+
+ if ($error_message)
+ {
+ like($err_initdb,
+ qr{$error_message},
+ "initdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_initdb, "", "\"@command\" no stderr");
+ like($out_initdb,
+ qr{The default collation provider is \"libc\"\.},
+ "initdb: $test_name: check output");
+ }
+
+ File::Path::rmtree $datadir;
+}
+
+# empty locales
+
+test_initdb(
+ "empty locales",
+ "",
+ "");
+
+# --locale
+
+test_initdb(
+ "empty libc locale",
+ "--locale=\@libc",
+ "");
+
+test_initdb(
+ "C locale without collation provider",
+ "--locale=C",
+ "");
+
+test_initdb(
+ "POSIX locale without collation provider",
+ "--locale=POSIX",
+ "");
+
+test_initdb(
+ "C libc locale",
+ "--locale=C\@libc",
+ "");
+
+test_initdb(
+ "C icu locale",
+ "--locale=C\@icu",
+ "ICU is not supported in this build");
+
+test_initdb(
+ "C locale too many modifiers",
+ "--locale=C\@icu\@libc",
+ "invalid locale name \"C\@icu\"");
+
+# --lc-collate
+
+test_initdb(
+ "empty libc lc_collate",
+ "--lc-collate=\@libc",
+ "");
+
+test_initdb(
+ "C lc_collate without collation provider",
+ "--lc-collate=C",
+ "");
+
+test_initdb(
+ "POSIX lc_collate without collation provider",
+ "--lc-collate=POSIX",
+ "");
+
+test_initdb(
+ "C libc lc_collate",
+ "--lc-collate=C\@libc",
+ "");
+
+test_initdb(
+ "C icu lc_collate",
+ "--lc-collate=C\@icu",
+ "ICU is not supported in this build");
+
+test_initdb(
+ "C lc_collate too many modifiers",
+ "--lc-collate=C\@icu\@libc",
+ "invalid locale name \"C\@icu\" \\(provider \"libc\"\\)");
+
+# --locale & --lc-collate
+
+test_initdb(
+ "lc_collate implicit provider takes precedence",
+ "--locale=\@icu --lc-collate=C",
+ "");
+
+test_initdb(
+ "lc_collate explicit provider takes precedence",
+ "--locale=\@icu --lc-collate=\@libc",
+ "");
+
+# test createdb and CREATE DATABASE
+
+sub test_createdb
+{
+ my ($test_name, $options, $from_template0, $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ if ($from_template0)
+ {
+ $options = $options . " --template=template0";
+ }
+
+ @command = ("createdb", split(" ", $options), "mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "createdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+
+ @command = (
+ "psql",
+ "-c",
+ "select datcollate from pg_database where datname = 'mydb';");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ like($out_command,
+ qr{\@libc\n},
+ "createdb: $test_name: check pg_database.datcollate");
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+sub test_create_database
+{
+ my ($test_name, $options, $from_template0, $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ @command = ("psql",
+ "-c",
+ "create database mydb "
+ . $options
+ . ($from_template0 ? " template = template0" : "")
+ . ";");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "CREATE DATABASE: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+ like($out_command, qr{CREATE DATABASE}, "\"@command\" check output");
+
+ @command = (
+ "psql",
+ "-c",
+ "select datcollate from pg_database where datname = 'mydb';");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ like($out_command,
+ qr{\@libc\n},
+ "CREATE DATABASE: $test_name: check pg_database.datcollate");
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+my $node = get_new_node('main');
+$node->init;
+$node->start;
+local $ENV{PGPORT} = $node->port;
+
+# test createdb
+
+# empty locales
+
+test_createdb(
+ "empty locales",
+ "",
+ 0,
+ "");
+
+# --locale
+
+test_createdb(
+ "empty libc locale",
+ "--locale=\@libc",
+ 0,
+ "");
+
+test_createdb(
+ "C locale without collation provider",
+ "--locale=C",
+ 1,
+ "");
+
+test_createdb(
+ "POSIX locale without collation provider",
+ "--locale=POSIX",
+ 1,
+ "");
+
+test_createdb(
+ "C libc locale",
+ "--locale=C\@libc",
+ 1,
+ "");
+
+test_createdb(
+ "C icu locale",
+ "--locale=C\@icu",
+ 1,
+ "ICU is not supported in this build");
+
+test_createdb(
+ "C locale too many modifiers",
+ "--locale=C\@icu\@libc",
+ 1,
+ "invalid locale name: \"C\@icu\"");
+
+# --lc-collate
+
+test_createdb(
+ "empty libc lc_collate",
+ "--lc-collate=\@libc",
+ 0,
+ "");
+
+test_createdb(
+ "C lc_collate without collation provider",
+ "--lc-collate=C",
+ 1,
+ "");
+test_createdb(
+ "POSIX lc_collate without collation provider",
+ "--lc-collate=POSIX",
+ 1,
+ "");
+
+test_createdb(
+ "C libc lc_collate",
+ "--lc-collate=C\@libc",
+ 1,
+ "");
+
+test_createdb(
+ "C icu lc_collate",
+ "--lc-collate=C\@icu",
+ 1,
+ "ICU is not supported in this build");
+
+test_createdb(
+ "C lc_collate too many modifiers",
+ "--lc-collate=C\@icu\@libc",
+ 1,
+ "invalid locale name: \"C\@icu\" \\(provider \"libc\"\\)");
+
+# test CREATE DATABASE
+
+# empty locales
+
+test_create_database(
+ "empty locales",
+ "",
+ 0,
+ "");
+
+# LC_COLLATE
+
+test_create_database(
+ "empty libc lc_collate",
+ "LC_COLLATE = '\@libc'",
+ 0,
+ "");
+
+test_create_database(
+ "C lc_collate without collation provider",
+ "LC_COLLATE = 'C'",
+ 1,
+ "");
+test_create_database(
+ "POSIX lc_collate without collation provider",
+ "LC_COLLATE = 'POSIX'",
+ 1,
+ "");
+
+test_create_database(
+ "C libc lc_collate",
+ "LC_COLLATE = 'C\@libc'",
+ 1,
+ "");
+
+test_create_database(
+ "C icu lc_collate",
+ "LC_COLLATE = 'C\@icu'",
+ 1,
+ "ICU is not supported in this build");
+
+test_create_database(
+ "C lc_collate too many modifiers",
+ "LC_COLLATE = 'C\@icu\@libc'",
+ 1,
+ "invalid locale name: \"C\@icu\" \\(provider \"libc\"\\)");
+
+$node->stop;
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index e1fc998..53f53d4 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -979,11 +979,14 @@ SELECT relname, pg_get_indexdef(oid) FROM pg_class WHERE relname LIKE 'collate_t
-- schema manipulation commands
CREATE ROLE regress_test_role;
CREATE SCHEMA test_schema;
+-- remove provider modifier and collation version
+CREATE FUNCTION get_lc_collate (text) RETURNS text LANGUAGE sql
+ AS $$ select substring($1 from '(.*)@[^@]+$') $$;
-- We need to do this this way to cope with varying names for encodings:
do $$
BEGIN
EXECUTE 'CREATE COLLATION test0 (provider = icu, locale = ' ||
- quote_literal(current_setting('lc_collate')) || ');';
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) || ');';
END
$$;
CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
@@ -991,7 +994,7 @@ ERROR: collation "test0" already exists
do $$
BEGIN
EXECUTE 'CREATE COLLATION test1 (provider = icu, lc_collate = ' ||
- quote_literal(current_setting('lc_collate')) ||
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) ||
', lc_ctype = ' ||
quote_literal(current_setting('lc_ctype')) || ');';
END
@@ -1102,7 +1105,7 @@ drop type textrange_c;
drop type textrange_en_us;
-- cleanup
DROP SCHEMA collate_tests CASCADE;
-NOTICE: drop cascades to 18 other objects
+NOTICE: drop cascades to 19 other objects
DETAIL: drop cascades to table collate_test1
drop cascades to table collate_test_like
drop cascades to table collate_test2
@@ -1121,6 +1124,7 @@ drop cascades to function mylt_noninline(text,text)
drop cascades to function mylt_plpgsql(text,text)
drop cascades to function mylt2(text,text)
drop cascades to function dup(anyelement)
+drop cascades to function get_lc_collate(text)
RESET search_path;
-- leave a collation for pg_upgrade test
CREATE COLLATION coll_icu_upgrade FROM "und-x-icu";
diff --git a/src/test/regress/expected/collate.linux.utf8.out b/src/test/regress/expected/collate.linux.utf8.out
index 6b73186..dec1420 100644
--- a/src/test/regress/expected/collate.linux.utf8.out
+++ b/src/test/regress/expected/collate.linux.utf8.out
@@ -988,11 +988,14 @@ SELECT relname, pg_get_indexdef(oid) FROM pg_class WHERE relname LIKE 'collate_t
-- schema manipulation commands
CREATE ROLE regress_test_role;
CREATE SCHEMA test_schema;
+-- remove provider modifier and collation version
+CREATE FUNCTION get_lc_collate (text) RETURNS text LANGUAGE sql
+ AS $$ select substring($1 from '(.*)@[^@]+$') $$;
-- We need to do this this way to cope with varying names for encodings:
do $$
BEGIN
EXECUTE 'CREATE COLLATION test0 (locale = ' ||
- quote_literal(current_setting('lc_collate')) || ');';
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) || ');';
END
$$;
CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
@@ -1004,7 +1007,7 @@ NOTICE: collation "test0" for encoding "UTF8" already exists, skipping
do $$
BEGIN
EXECUTE 'CREATE COLLATION test1 (lc_collate = ' ||
- quote_literal(current_setting('lc_collate')) ||
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) ||
', lc_ctype = ' ||
quote_literal(current_setting('lc_ctype')) || ');';
END
@@ -1119,7 +1122,7 @@ drop type textrange_c;
drop type textrange_en_us;
-- cleanup
DROP SCHEMA collate_tests CASCADE;
-NOTICE: drop cascades to 18 other objects
+NOTICE: drop cascades to 19 other objects
DETAIL: drop cascades to table collate_test1
drop cascades to table collate_test_like
drop cascades to table collate_test2
@@ -1138,3 +1141,4 @@ drop cascades to function mylt_noninline(text,text)
drop cascades to function mylt_plpgsql(text,text)
drop cascades to function mylt2(text,text)
drop cascades to function dup(anyelement)
+drop cascades to function get_lc_collate(text)
diff --git a/src/test/regress/sql/collate.icu.utf8.sql b/src/test/regress/sql/collate.icu.utf8.sql
index ef39445..936d684 100644
--- a/src/test/regress/sql/collate.icu.utf8.sql
+++ b/src/test/regress/sql/collate.icu.utf8.sql
@@ -339,18 +339,22 @@ SELECT relname, pg_get_indexdef(oid) FROM pg_class WHERE relname LIKE 'collate_t
CREATE ROLE regress_test_role;
CREATE SCHEMA test_schema;
+-- remove provider modifier and collation version
+CREATE FUNCTION get_lc_collate (text) RETURNS text LANGUAGE sql
+ AS $$ select substring($1 from '(.*)@[^@]+$') $$;
+
-- We need to do this this way to cope with varying names for encodings:
do $$
BEGIN
EXECUTE 'CREATE COLLATION test0 (provider = icu, locale = ' ||
- quote_literal(current_setting('lc_collate')) || ');';
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) || ');';
END
$$;
CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
do $$
BEGIN
EXECUTE 'CREATE COLLATION test1 (provider = icu, lc_collate = ' ||
- quote_literal(current_setting('lc_collate')) ||
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) ||
', lc_ctype = ' ||
quote_literal(current_setting('lc_ctype')) || ');';
END
diff --git a/src/test/regress/sql/collate.linux.utf8.sql b/src/test/regress/sql/collate.linux.utf8.sql
index b51162e..e03ea1b 100644
--- a/src/test/regress/sql/collate.linux.utf8.sql
+++ b/src/test/regress/sql/collate.linux.utf8.sql
@@ -339,11 +339,15 @@ SELECT relname, pg_get_indexdef(oid) FROM pg_class WHERE relname LIKE 'collate_t
CREATE ROLE regress_test_role;
CREATE SCHEMA test_schema;
+-- remove provider modifier and collation version
+CREATE FUNCTION get_lc_collate (text) RETURNS text LANGUAGE sql
+ AS $$ select substring($1 from '(.*)@[^@]+$') $$;
+
-- We need to do this this way to cope with varying names for encodings:
do $$
BEGIN
EXECUTE 'CREATE COLLATION test0 (locale = ' ||
- quote_literal(current_setting('lc_collate')) || ');';
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) || ');';
END
$$;
CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
@@ -352,7 +356,7 @@ CREATE COLLATION IF NOT EXISTS test0 (locale = 'foo'); -- ok, skipped
do $$
BEGIN
EXECUTE 'CREATE COLLATION test1 (lc_collate = ' ||
- quote_literal(current_setting('lc_collate')) ||
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) ||
', lc_ctype = ' ||
quote_literal(current_setting('lc_ctype')) || ');';
END
diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index d8c279a..27fc2b8 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -49,7 +49,13 @@ my @contrib_excludes = (
'snapshot_too_old');
# Set of variables for frontend modules
-my $frontend_defines = { 'initdb' => 'FRONTEND' };
+my $frontend_defines = {
+ 'initdb' => 'FRONTEND',
+ 'psql' => 'FRONTEND',
+ 'pg_dump' => 'FRONTEND',
+ 'pg_dumpall' => 'FRONTEND',
+ 'pg_restore' => 'FRONTEND',
+ };
my @frontend_uselibpq = ('pg_ctl', 'pg_upgrade', 'pgbench', 'psql', 'initdb');
my @frontend_uselibpgport = (
'pg_archivecleanup', 'pg_test_fsync',
@@ -59,11 +65,14 @@ my @frontend_uselibpgcommon = (
'pg_archivecleanup', 'pg_test_fsync',
'pg_test_timing', 'pg_upgrade',
'pg_waldump', 'pgbench');
+my @iculibs = ('icuin.lib', 'icuuc.lib');
my $frontend_extralibs = {
'initdb' => ['ws2_32.lib'],
'pg_restore' => ['ws2_32.lib'],
'pgbench' => ['ws2_32.lib'],
+ 'mchar' => [@iculibs],
'psql' => ['ws2_32.lib'] };
+my @frontend_iculibs = ('initdb', 'pg_upgrade');
my $frontend_extraincludes = {
'initdb' => ['src/timezone'],
'psql' => ['src/backend'] };
@@ -111,9 +120,9 @@ sub mkvcbuild
our @pgcommonallfiles = qw(
base64.c config_info.c controldata_utils.c exec.c ip.c keywords.c
- md5.c pg_lzcompress.c pgfnames.c psprintf.c relpath.c rmtree.c
- saslprep.c scram-common.c string.c unicode_norm.c username.c
- wait_error.c);
+ md5.c pg_collation_fn_common.c pg_lzcompress.c pgfnames.c psprintf.c
+ relpath.c rmtree.c saslprep.c scram-common.c string.c unicode_norm.c
+ username.c wait_error.c);
if ($solution->{options}->{openssl})
{
@@ -145,6 +154,7 @@ sub mkvcbuild
$libpgfeutils->AddDefine('FRONTEND');
$libpgfeutils->AddIncludeDir('src/interfaces/libpq');
$libpgfeutils->AddFiles('src/fe_utils', @pgfeutilsfiles);
+ $libpgfeutils->AddFile('src/common/pg_collation_fn_common.c');
$postgres = $solution->AddProject('postgres', 'exe', '', 'src/backend');
$postgres->AddIncludeDir('src/backend');
@@ -228,6 +238,7 @@ sub mkvcbuild
'src/interfaces/libpq');
$libpq->AddDefine('FRONTEND');
$libpq->AddDefine('UNSAFE_STAT_OK');
+ $libpq->AddDefine('LIBPQ_MAKE');
$libpq->AddIncludeDir('src/port');
$libpq->AddLibrary('secur32.lib');
$libpq->AddLibrary('ws2_32.lib');
@@ -236,6 +247,7 @@ sub mkvcbuild
$libpq->ReplaceFile('src/interfaces/libpq/libpqrc.c',
'src/interfaces/libpq/libpq.rc');
$libpq->AddReference($libpgport);
+ $libpq->AddFile('src/common/pg_collation_fn_common.c');
# The OBJS scraper doesn't know about ifdefs, so remove fe-secure-openssl.c
# and sha2_openssl.c if building without OpenSSL, and remove sha2.c if
@@ -420,6 +432,12 @@ sub mkvcbuild
{
push @contrib_excludes, 'uuid-ossp';
}
+ else
+ {
+ foreach my $fe (@frontend_iculibs) {
+ push @{$frontend_extralibs->{$fe}}, @iculibs;
+ }
+ }
# AddProject() does not recognize the constructs used to populate OBJS in
# the pgcrypto Makefile, so it will discover no files.
--
2.7.4
Hi everyone!
On 10 Feb 2018, at 20:45, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
I'm planning to provide review
So, I was looking into the patch.
The patch adds:
1. The ability to specify a collation provider (with version) in --locale for initdb and createdb.
2. Changes to the locale checks.
3. ICU as the default collation provider. For example, "ru_RU@icu.153.80.32.1" is the default on my machine with the patch.
4. Tests and the necessary documentation changes.
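The value in item 3 appears to have the shape locale@provider.version. The following is only a minimal C sketch of how such a string could be split into its parts; it is not code from the patch. CollationSpec and parse_collation_spec are hypothetical names, and the rule of treating only a trailing "@icu" or "@libc" as a provider (so that libc modifiers such as "ca_ES@valencia" stay part of the locale) is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical container for illustration; not a structure from the patch. */
typedef struct
{
    char locale[64];    /* libc-style locale name, e.g. "ru_RU" */
    char provider[8];   /* "icu", "libc", or "" when none was given */
    char version[32];   /* e.g. "153.80.32.1", or "" when absent */
} CollationSpec;

/*
 * Split "locale[@provider[.version]]" into its parts.
 * Returns false if a part does not fit into the fixed-size buffers.
 */
static bool
parse_collation_spec(const char *spec, CollationSpec *out)
{
    const char *at;
    size_t      loclen;

    assert(spec != NULL && out != NULL);
    memset(out, 0, sizeof(*out));

    /* Assumption: the provider, if any, follows the last '@'. */
    at = strrchr(spec, '@');
    loclen = strlen(spec);

    if (at != NULL)
    {
        const char *suffix = at + 1;
        size_t      plen = 0;

        if (strncmp(suffix, "icu", 3) == 0)
            plen = 3;
        else if (strncmp(suffix, "libc", 4) == 0)
            plen = 4;

        /* The provider must be followed by end of string or ".version". */
        if (plen > 0 && (suffix[plen] == '\0' || suffix[plen] == '.'))
        {
            memcpy(out->provider, suffix, plen);    /* buffer is zeroed */
            if (suffix[plen] == '.')
            {
                if (strlen(suffix + plen + 1) >= sizeof(out->version))
                    return false;
                strcpy(out->version, suffix + plen + 1);
            }
            loclen = (size_t) (at - spec);
        }
    }

    /* Anything that is not a recognized provider stays in the locale. */
    if (loclen >= sizeof(out->locale))
        return false;
    memcpy(out->locale, spec, loclen);
    return true;
}
```

With this sketch, "ru_RU@icu.153.80.32.1" splits into locale "ru_RU", provider "icu" and version "153.80.32.1", while "ru_RU.UTF-8" comes back with an empty provider.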
With the patch I get the correct ICU ordering by default:
postgres=# select unnest(array['е','ё','ж']) order by 1;
unnest
--------
е
ё
ж
(3 rows)
The libc locale, by contrast, gives an incorrect order (I also get the same ordering by default without the patch):
postgres=# select c from unnest(array['е','ё','ж']) c order by c collate "ru_RU";
c
---
е
ж
ё
(3 rows)
Unfortunately, neither "ru_RU@icu.153.80.32.1" (exposed by LC_COLLATE and in other places) nor "ru_RU@icu" can be used in the COLLATE SQL clause.
Also, the patch removes compatibility with MSVC 1800 (Visual Studio 2013) on Windows XP and Windows Server 2003. This is done in order to use newer locale-related functions in the VS2013 build.
If a database was initialized with the default locale without this patch, one can no longer connect to it:
psql: FATAL: could not find out the collation provider for datcollate "ru_RU.UTF-8" of database "postgres"
This problem is mentioned in the commit message of the patch. I think it should be addressed somehow.
What do you think?
Overall, the patch looks like solid and thoughtful work, and it adds important functionality.
Best regards, Andrey Borodin.
Hello.
Just to let you know:
I have run the
check,installcheck,plcheck,contribcheck,modulescheck,ecpgcheck,isolationcheck,upgradecheck
tests on Windows 10 with VC2017, with the patch applied on top of
2a41507dab0f293ff241fe8ae326065998668af8, as Andrey asked me.
Everything passes both with and without $config->{icu} =
'D:\Dev\postgres\icu\';
Best regards,
Michail.
On Fri, 16 Feb 2018 at 11:13, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
Hi,
On 2018-02-10 20:45:40 +0500, Andrey Borodin wrote:
I've contacted Postgres Professional. Marina Polyakova had kindly provided their patch.
The patch allows to use libc locale with ICU collation as default for cluster or database.
It seems that this patch brings important long-awaited feature and deserves to be included in last v11 commitfest.
Peter, everyone, do you agree with this? Or should we better adapt this work through v12 cycle?
I'm planning to provide review asap and do necessary changes if required (this was discussed with Marina and Postgres Professional).
This patch was submitted for the last v11 commitfest, it's not a trivial
patch, and hasn't yet been reviewed. I'm afraid the policy is that large
patches shouldn't be submitted for the last commitfest... Thus I think
this should be moved to the next one.
Greetings,
Andres Freund
On 3/2/18 1:14 AM, Andres Freund wrote:
This patch was submitted for the last v11 commitfest, it's not a trivial
patch, and hasn't yet been reviewed. I'm afraid the policy is that large
patches shouldn't be submitted for the last commitfest... Thus I think
this should be moved to the next one.
This patch has been moved to the next CF.
Regards,
--
-David
david@pgmasters.net
Andrey Borodin wrote:
Overall patch looks solid and thoughtful work and adds important
functionality.
I tried the patch, with some minor changes to build with HEAD.
I was surprised by the interface, that is, the fact that a user is
not allowed to freely choose the ICU collation of a database, in
contrast with CREATE COLLATION.
AFAIU, when the "default collation provider" is ICU, CREATE DATABASE
still expects a libc locale in the lc_collate/lc_ctype arguments.
The code will automatically find an ICU equivalent by matching the
language, and it seems that the country is ignored?
So if we wanted a database with an ICU collation like, say,
"es@collation=traditional" or "es-u-co-trad" as expressed with a
BCP-47 tag, or anything that is not defined by only a language,
would it be possible? I have the impression it wouldn't.
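To illustrate the concern: if the ICU equivalent is derived from the libc name by language alone, every libc locale sharing a language collapses to the same ICU collation. Below is a minimal C sketch of such a reduction. It is not the patch's code; libc_locale_language is a hypothetical helper, and cutting at the first '_', '.' or '@' is an assumption about what "matching the language" means.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Copy the language part of a libc locale name (the text before the first
 * '_', '.' or '@') into buf. Returns false if there is no language part or
 * if it does not fit into buf.
 */
static bool
libc_locale_language(const char *locale, char *buf, size_t buflen)
{
    size_t      len;

    assert(locale != NULL && buf != NULL);

    /* Length of the leading run that contains none of the separators. */
    len = strcspn(locale, "_.@");
    if (len == 0 || len >= buflen)
        return false;

    memcpy(buf, locale, len);
    buf[len] = '\0';
    return true;
}
```

Both "es_ES.UTF-8" and "es_MX.UTF-8" reduce to "es" here, and nothing in the input can carry a variant such as "es-u-co-trad".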
This is not something that could be easily improved after the fact
because getting the ICU collation through a libc collation is a
user-interface choice.
I think users would rather be able to create a database with
something like:
CREATE DATABASE foo
COLLPROVIDER='icu'
LOCALE='icu_locale' |
LC_COLLATE='icu_locale' | LC_CTYPE='icu_locale'
...
which would be in line with CREATE COLLATION.
Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
I wrote:
I tried the patch, with some minor changes to build with HEAD.
PFA a rebased version.
(for some reason http://cfbot.cputube.org/ did not pick up the initial patch;
there's an entry for it but the rightmost column is empty,
no link to patch and no green/red success/failure icons.
It seems it's the only entry in this state).
Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
Attachments:
ICU-as-default-collation-provider-rebased-jul-2018.patch (text/plain)
diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml
index dc3fd34..f28e0ec 100644
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@@ -537,6 +537,61 @@ SELECT * FROM test1 ORDER BY a || b COLLATE "fr_FR";
a database.
</para>
+ <para>
+ You can specify the default collation provider with the <option>--locale</option>
+ and <option>--lc-collate</option> options of the <xref linkend="app-initdb"/> or
+ <xref linkend="app-createdb"/> commands, as follows:
+<programlisting>
+--locale=<replaceable>locale</replaceable>[@<replaceable>provider</replaceable>]
+--lc-collate=<replaceable>locale</replaceable>[@<replaceable>provider</replaceable>]
+</programlisting>
+ where <replaceable>provider</replaceable> can take the <literal>icu</literal>
+ or <literal>libc</literal> value, and <replaceable>locale</replaceable> is specified
+ in the <literal>libc</literal> format. You can only specify a single
+ locale provider after the <literal>@</literal> symbol.
+ The <literal>--lc-collate</literal> option overrides the
+ <literal>--locale</literal> setting, regardless of whether it specifies the
+ collation provider.
+ </para>
+
+ <para>
+ If you omit the collation provider options, <literal>libc</literal>
+ provider is used for <literal>C</literal> and <literal>POSIX</literal>
+ locales. For other locales, the default providers are:
+ <itemizedlist spacing="compact">
+ <listitem>
+ <para>
+ <literal>icu</literal> at the cluster level
+ </para>
+ </listitem>
+ <listitem>
+ <para>Default collation provider from the template database at
+ the database level
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+
+ <important>
+ <para>
+ You can only use the <literal>icu</literal> collation provider for locales that are
+ supported by <literal>libc</literal> in your operating system and satisfy all
+ restrictions applicable to <literal>icu</literal>.
+ </para>
+ </important>
+
+ <para>
+ When you connect to a database,
+ <productname>PostgreSQL</productname> checks that the selected collation
+ provider and the version of the default collation are supported.
+ You can find the default database collation and the collation provider
+ in <structname>pg_database.datcollate</structname>. For ICU collations, collation version is
+ also stored:
+ <programlisting>
+<replaceable>locale</replaceable>@<replaceable>provider</replaceable>[.<replaceable>version</replaceable>]
+</programlisting>
+ </para>
+
<sect3>
<title>Standard Collations</title>
diff --git a/doc/src/sgml/ref/create_database.sgml b/doc/src/sgml/ref/create_database.sgml
index b2c9e24..8b2e153 100644
--- a/doc/src/sgml/ref/create_database.sgml
+++ b/doc/src/sgml/ref/create_database.sgml
@@ -25,7 +25,7 @@ CREATE DATABASE <replaceable class="parameter">name</replaceable>
[ [ WITH ] [ OWNER [=] <replaceable class="parameter">user_name</replaceable> ]
[ TEMPLATE [=] <replaceable class="parameter">template</replaceable> ]
[ ENCODING [=] <replaceable class="parameter">encoding</replaceable> ]
- [ LC_COLLATE [=] <replaceable class="parameter">lc_collate</replaceable> ]
+ [ LC_COLLATE [=] <replaceable class="parameter">lc_collate</replaceable>[@<replaceable class="parameter">provider</replaceable>] ]
[ LC_CTYPE [=] <replaceable class="parameter">lc_ctype</replaceable> ]
[ TABLESPACE [=] <replaceable class="parameter">tablespace_name</replaceable> ]
[ ALLOW_CONNECTIONS [=] <replaceable class="parameter">allowconn</replaceable> ]
@@ -112,13 +112,17 @@ CREATE DATABASE <replaceable class="parameter">name</replaceable>
</listitem>
</varlistentry>
<varlistentry>
- <term><replaceable class="parameter">lc_collate</replaceable></term>
+ <term><replaceable class="parameter">lc_collate</replaceable>[@<replaceable class="parameter">provider</replaceable>]</term>
<listitem>
<para>
Collation order (<literal>LC_COLLATE</literal>) to use in the new database.
This affects the sort order applied to strings, e.g. in queries with
ORDER BY, as well as the order used in indexes on text columns.
The default is to use the collation order of the template database.
+ Optionally, you can specify the default collation provider after the
+ <literal>@</literal> symbol, as explained in
+ <xref linkend="collation-managing"/>. Supported values are <literal>icu</literal>
+ and <literal>libc</literal>.
See below for additional restrictions.
</para>
</listitem>
diff --git a/doc/src/sgml/ref/createdb.sgml b/doc/src/sgml/ref/createdb.sgml
index 2658efe..dbf87d3 100644
--- a/doc/src/sgml/ref/createdb.sgml
+++ b/doc/src/sgml/ref/createdb.sgml
@@ -121,22 +121,34 @@ PostgreSQL documentation
</varlistentry>
<varlistentry>
- <term><option>-l <replaceable class="parameter">locale</replaceable></option></term>
- <term><option>--locale=<replaceable class="parameter">locale</replaceable></option></term>
+ <term><option>-l <replaceable class="parameter">locale</replaceable>[@<replaceable>provider</replaceable>]</option></term>
+ <term><option>--locale=<replaceable class="parameter">locale</replaceable>[@<replaceable>provider</replaceable>]</option></term>
<listitem>
<para>
Specifies the locale to be used in this database. This is equivalent
to specifying both <option>--lc-collate</option> and <option>--lc-ctype</option>.
</para>
+
+ <para>
+ Optionally, you can specify the default collation provider after the
+ <literal>@</literal> symbol. Supported values are <literal>icu</literal>
+ and <literal>libc</literal>. For details, see <xref linkend="collation-managing"/>.
+ </para>
</listitem>
</varlistentry>
<varlistentry>
- <term><option>--lc-collate=<replaceable class="parameter">locale</replaceable></option></term>
+ <term><option>--lc-collate=<replaceable class="parameter">locale</replaceable>[@<replaceable>provider</replaceable>]</option></term>
<listitem>
<para>
Specifies the LC_COLLATE setting to be used in this database.
</para>
+
+ <para>
+ Optionally, you can specify the default collation provider after the
+ <literal>@</literal> symbol. Supported values are <literal>icu</literal>
+ and <literal>libc</literal>. For details, see <xref linkend="collation-managing"/>.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/initdb.sgml b/doc/src/sgml/ref/initdb.sgml
index 4489b58..738e41b 100644
--- a/doc/src/sgml/ref/initdb.sgml
+++ b/doc/src/sgml/ref/initdb.sgml
@@ -222,7 +222,7 @@ PostgreSQL documentation
</varlistentry>
<varlistentry>
- <term><option>--locale=<replaceable>locale</replaceable></option></term>
+ <term><option>--locale=<replaceable>locale</replaceable>[@<replaceable>provider</replaceable>]</option></term>
<listitem>
<para>
Sets the default locale for the database cluster. If this
@@ -230,11 +230,16 @@ PostgreSQL documentation
environment that <command>initdb</command> runs in. Locale
support is described in <xref linkend="locale"/>.
</para>
+ <para>
+ Optionally, you can specify the default collation provider after the
+ <literal>@</literal> symbol. Supported values are <literal>icu</literal>
+ and <literal>libc</literal>. For details, see <xref linkend="collation-managing"/>.
+ </para>
</listitem>
</varlistentry>
<varlistentry>
- <term><option>--lc-collate=<replaceable>locale</replaceable></option></term>
+ <term><option>--lc-collate=<replaceable>locale</replaceable>[@<replaceable>provider</replaceable>]</option></term>
<term><option>--lc-ctype=<replaceable>locale</replaceable></option></term>
<term><option>--lc-messages=<replaceable>locale</replaceable></option></term>
<term><option>--lc-monetary=<replaceable>locale</replaceable></option></term>
diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index 673a8c2..dbc5279 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -328,6 +328,23 @@ make check EXTRA_TESTS='collate.icu.utf8 collate.linux.utf8' LANG=en_US.utf8
</sect2>
<sect2>
+ <title>Extra TAP Tests for Default Collations</title>
+
+ <para>
+ To test the default collations on Linux/glibc platforms,
+ you can run extra TAP tests, as follows:
+<screen>
+make -C src/test/default_collation check-utf8
+</screen>
+ These tests only succeed when run in a database that uses the UTF-8
+ encoding. As these tests are TAP-based, you can only run them if
+ <productname>PostgreSQL</productname> was configured with the
+ <option>--enable-tap-tests</option> option.
+ For details, see <xref linkend="regress-tap"/>.
+ </para>
+ </sect2>
+
+ <sect2>
<title>Testing Hot Standby</title>
<para>
diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql
index f4e69f4..8d34006 100644
--- a/src/backend/catalog/information_schema.sql
+++ b/src/backend/catalog/information_schema.sql
@@ -397,7 +397,7 @@ CREATE VIEW character_sets AS
CAST(c.collname AS sql_identifier) AS default_collate_name
FROM pg_database d
LEFT JOIN (pg_collation c JOIN pg_namespace nc ON (c.collnamespace = nc.oid))
- ON (datcollate = collcollate AND datctype = collctype)
+ ON (datcollate = (collcollate || '@libc') AND datctype = collctype)
WHERE d.datname = current_database()
ORDER BY char_length(c.collname) DESC, c.collname ASC -- prefer full/canonical name
LIMIT 1;
diff --git a/src/backend/commands/collationcmds.c b/src/backend/commands/collationcmds.c
index 8fb51e8..6846ebc 100644
--- a/src/backend/commands/collationcmds.c
+++ b/src/backend/commands/collationcmds.c
@@ -27,6 +27,7 @@
#include "commands/comment.h"
#include "commands/dbcommands.h"
#include "commands/defrem.h"
+#include "common/pg_collation_fn_common.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "utils/builtins.h"
@@ -162,11 +163,8 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
if (collproviderstr)
{
- if (pg_strcasecmp(collproviderstr, "icu") == 0)
- collprovider = COLLPROVIDER_ICU;
- else if (pg_strcasecmp(collproviderstr, "libc") == 0)
- collprovider = COLLPROVIDER_LIBC;
- else
+ collprovider = get_collprovider(collproviderstr);
+ if (!is_valid_nondefault_collprovider(collprovider))
ereport(ERROR,
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
errmsg("unrecognized collation provider: %s",
@@ -192,7 +190,8 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
else
{
collencoding = GetDatabaseEncoding();
- check_encoding_locale_matches(collencoding, collcollate, collctype);
+ check_encoding_locale_matches(collencoding, collcollate, collctype,
+ collprovider);
}
}
@@ -434,26 +433,6 @@ cmpaliases(const void *a, const void *b)
#ifdef USE_ICU
/*
- * Get the ICU language tag for a locale name.
- * The result is a palloc'd string.
- */
-static char *
-get_icu_language_tag(const char *localename)
-{
- char buf[ULOC_FULLNAME_CAPACITY];
- UErrorCode status;
-
- status = U_ZERO_ERROR;
- uloc_toLanguageTag(localename, buf, sizeof(buf), TRUE, &status);
- if (U_FAILURE(status))
- ereport(ERROR,
- (errmsg("could not convert locale name \"%s\" to language tag: %s",
- localename, u_errorName(status))));
-
- return pstrdup(buf);
-}
-
-/*
* Get a comment (specifically, the display name) for an ICU locale.
* The result is a palloc'd string, or NULL if we can't get a comment
* or find that it's not all ASCII. (We can *not* accept non-ASCII
@@ -698,7 +677,7 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
name = uloc_getAvailable(i);
langtag = get_icu_language_tag(name);
- collcollate = U_ICU_VERSION_MAJOR_NUM >= 54 ? langtag : name;
+ collcollate = get_icu_collate(name, langtag);
/*
* Be paranoid about not allowing any non-ASCII strings into
diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index 5342f21..610fece 100644
--- a/src/backend/commands/dbcommands.c
+++ b/src/backend/commands/dbcommands.c
@@ -34,6 +34,7 @@
#include "catalog/indexing.h"
#include "catalog/objectaccess.h"
#include "catalog/pg_authid.h"
+#include "catalog/pg_collation.h"
#include "catalog/pg_database.h"
#include "catalog/pg_db_role_setting.h"
#include "catalog/pg_subscription.h"
@@ -44,6 +45,7 @@
#include "commands/defrem.h"
#include "commands/seclabel.h"
#include "commands/tablespace.h"
+#include "common/pg_collation_fn_common.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -141,6 +143,14 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
int notherbackends;
int npreparedxacts;
createdb_failure_params fparms;
+ char *src_canonname;
+ char src_collprovider;
+ char *dbcanonname = NULL;
+ char dbcollprovider;
+ char *dbcollate_full_name;
+ char *icu_wincollate = NULL;
+ char *langtag = NULL;
+ const char *collate;
/* Extract options from the statement node tree */
foreach(option, stmt->options)
@@ -350,8 +360,28 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
/* If encoding or locales are defaulted, use source's setting */
if (encoding < 0)
encoding = src_encoding;
+
+ check_locale_collprovider(src_collate, &src_canonname, &src_collprovider,
+ NULL);
+
+ if (!is_valid_nondefault_collprovider(src_collprovider))
+ /* This could happen when manually creating a mess in the catalogs. */
+ ereport(FATAL,
+ (errmsg("could not find out the collation provider for datcollate \"%s\" of template database \"%s\"",
+ src_collate, dbtemplate)));
+
if (dbcollate == NULL)
- dbcollate = src_collate;
+ {
+ dbcollate = src_canonname;
+ dbcollprovider = src_collprovider;
+ }
+ else
+ {
+ check_locale_collprovider(dbcollate, &dbcanonname, &dbcollprovider,
+ NULL);
+ dbcollate = dbcanonname;
+ }
+
if (dbctype == NULL)
dbctype = src_ctype;
@@ -362,18 +392,88 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
errmsg("invalid server encoding %d", encoding)));
/* Check that the chosen locales are valid, and get canonical spellings */
- if (!check_locale(LC_COLLATE, dbcollate, &canonname))
- ereport(ERROR,
- (errcode(ERRCODE_WRONG_OBJECT_TYPE),
- errmsg("invalid locale name: \"%s\"", dbcollate)));
- dbcollate = canonname;
- if (!check_locale(LC_CTYPE, dbctype, &canonname))
+
+ if (!check_locale(LC_CTYPE, dbctype, &canonname, '\0'))
ereport(ERROR,
(errcode(ERRCODE_WRONG_OBJECT_TYPE),
errmsg("invalid locale name: \"%s\"", dbctype)));
dbctype = canonname;
- check_encoding_locale_matches(encoding, dbcollate, dbctype);
+ /* we always check lc_collate for libc */
+ if (!check_locale(LC_COLLATE, dbcollate, &canonname, COLLPROVIDER_LIBC))
+ ereport(ERROR,
+ (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("invalid locale name: \"%s\" (provider \"%s\")",
+ dbcollate, get_collprovider_name(COLLPROVIDER_LIBC))));
+ dbcollate = canonname;
+
+ /* determine the collation provider if we haven't already done it */
+ if (!is_valid_nondefault_collprovider(dbcollprovider))
+ {
+ if (locale_is_c(dbcollate))
+ dbcollprovider = COLLPROVIDER_LIBC;
+ else
+ dbcollprovider = src_collprovider;
+ }
+
+ Assert(is_valid_nondefault_collprovider(dbcollprovider));
+
+#ifndef USE_ICU
+ if (dbcollprovider == COLLPROVIDER_ICU)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("ICU is not supported in this build"),
+ errhint("You need to rebuild PostgreSQL using --with-icu.")));
+#endif
+
+ /* check lc_collate and lc_ctype for icu if we need it */
+ if (dbcollprovider == COLLPROVIDER_ICU)
+ {
+ if (!check_locale(LC_COLLATE, dbcollate, NULL, dbcollprovider))
+ ereport(ERROR,
+ (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("invalid locale name: \"%s\" (provider \"%s\")",
+ dbcollate, get_collprovider_name(dbcollprovider))));
+
+ if (strcmp(dbcollate, dbctype) != 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("collations with different collate and ctype values are not supported by ICU")));
+ }
+
+ check_encoding_locale_matches(encoding, dbcollate, dbctype, dbcollprovider);
+
+ /* get the collation version */
+
+#ifdef USE_ICU
+ if (dbcollprovider == COLLPROVIDER_ICU)
+ {
+ collate = (const char *) dbcollate;
+#ifdef WIN32
+ if (!locale_is_c(collate))
+ {
+ icu_wincollate = check_icu_winlocale(collate);
+ collate = (const char *) icu_wincollate;
+ }
+#endif /* WIN32 */
+ langtag = get_icu_language_tag(collate);
+ collate = get_icu_collate(collate, langtag);
+ }
+ else
+#endif /* USE_ICU */
+ {
+ /* COLLPROVIDER_LIBC */
+ collate = (const char *) dbcollate;
+ }
+
+ dbcollate_full_name = get_full_collation_name(
+ dbcollate, dbcollprovider,
+ get_collation_actual_version(dbcollprovider, collate));
+
+ if (strlen(dbcollate_full_name) >= NAMEDATALEN)
+ ereport(ERROR,
+ (errmsg("the full database collation name \"%s\" is too long",
+ dbcollate_full_name)));
/*
* Check that the new encoding and locale settings match the source
@@ -395,11 +495,11 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
pg_encoding_to_char(src_encoding)),
errhint("Use the same encoding as in the template database, or use template0 as template.")));
- if (strcmp(dbcollate, src_collate) != 0)
+ if (strcmp(dbcollate_full_name, src_collate) != 0)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("new collation (%s) is incompatible with the collation of the template database (%s)",
- dbcollate, src_collate),
+ dbcollate_full_name, src_collate),
errhint("Use the same collation as in the template database, or use template0 as template.")));
if (strcmp(dbctype, src_ctype) != 0)
@@ -522,7 +622,7 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
new_record[Anum_pg_database_datdba - 1] = ObjectIdGetDatum(datdba);
new_record[Anum_pg_database_encoding - 1] = Int32GetDatum(encoding);
new_record[Anum_pg_database_datcollate - 1] =
- DirectFunctionCall1(namein, CStringGetDatum(dbcollate));
+ DirectFunctionCall1(namein, CStringGetDatum(dbcollate_full_name));
new_record[Anum_pg_database_datctype - 1] =
DirectFunctionCall1(namein, CStringGetDatum(dbctype));
new_record[Anum_pg_database_datistemplate - 1] = BoolGetDatum(dbistemplate);
@@ -690,6 +790,16 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
*/
ForceSyncCommit();
}
+
+ pfree(src_canonname);
+ pfree(dbcollate_full_name);
+ if (dbcanonname)
+ pfree(dbcanonname);
+ if (langtag)
+ pfree(langtag);
+ if (icu_wincollate)
+ pfree(icu_wincollate);
+
PG_END_ENSURE_ERROR_CLEANUP(createdb_failure_callback,
PointerGetDatum(&fparms));
@@ -719,7 +829,8 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
* Note: if you change this policy, fix initdb to match.
*/
void
-check_encoding_locale_matches(int encoding, const char *collate, const char *ctype)
+check_encoding_locale_matches(int encoding, const char *collate, const char *ctype,
+ char collprovider)
{
int ctype_encoding = pg_get_encoding_from_locale(ctype, true);
int collate_encoding = pg_get_encoding_from_locale(collate, true);
@@ -753,6 +864,23 @@ check_encoding_locale_matches(int encoding, const char *collate, const char *cty
collate),
errdetail("The chosen LC_COLLATE setting requires encoding \"%s\".",
pg_encoding_to_char(collate_encoding))));
+
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+#ifdef USE_ICU
+ if (!(is_encoding_supported_by_icu(encoding) ||
+ (encoding == PG_SQL_ASCII && superuser())))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("encoding \"%s\" is not supported for ICU locales",
+ pg_encoding_to_char(encoding))));
+#else
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("ICU is not supported in this build"),
+ errhint("You need to rebuild PostgreSQL using --with-icu.")));
+#endif
+ }
}
/* Error cleanup callback for createdb */
diff --git a/src/backend/main/main.c b/src/backend/main/main.c
index 38853e3..cb27d62 100644
--- a/src/backend/main/main.c
+++ b/src/backend/main/main.c
@@ -32,6 +32,7 @@
#endif
#include "bootstrap/bootstrap.h"
+#include "catalog/pg_collation.h"
#include "common/username.h"
#include "port/atomics.h"
#include "postmaster/postmaster.h"
@@ -306,8 +307,8 @@ startup_hacks(const char *progname)
static void
init_locale(const char *categoryname, int category, const char *locale)
{
- if (pg_perm_setlocale(category, locale) == NULL &&
- pg_perm_setlocale(category, "C") == NULL)
+ if (pg_perm_setlocale(category, locale, COLLPROVIDER_LIBC) == NULL &&
+ pg_perm_setlocale(category, "C", COLLPROVIDER_LIBC) == NULL)
elog(FATAL, "could not adopt \"%s\" locale nor C locale for %s",
locale, categoryname);
}
diff --git a/src/backend/regex/regc_pg_locale.c b/src/backend/regex/regc_pg_locale.c
index acbed2e..e836553 100644
--- a/src/backend/regex/regc_pg_locale.c
+++ b/src/backend/regex/regc_pg_locale.c
@@ -16,6 +16,7 @@
*/
#include "catalog/pg_collation.h"
+#include "common/pg_collation_fn_common.h"
#include "utils/pg_locale.h"
/*
@@ -240,8 +241,13 @@ pg_set_regex_collation(Oid collation)
}
else
{
+ char collprovider;
+
if (collation == DEFAULT_COLLATION_OID)
+ {
pg_regex_locale = 0;
+ collprovider = get_default_collprovider();
+ }
else if (OidIsValid(collation))
{
/*
@@ -250,6 +256,7 @@ pg_set_regex_collation(Oid collation)
* have to be considered below.
*/
pg_regex_locale = pg_newlocale_from_collation(collation);
+ collprovider = pg_regex_locale->provider;
}
else
{
@@ -263,24 +270,35 @@ pg_set_regex_collation(Oid collation)
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
+ Assert(is_valid_nondefault_collprovider(collprovider));
+
+ if (collprovider == COLLPROVIDER_ICU)
+ {
#ifdef USE_ICU
- if (pg_regex_locale && pg_regex_locale->provider == COLLPROVIDER_ICU)
pg_regex_strategy = PG_REGEX_LOCALE_ICU;
- else
+#else
+ /* shouldn't happen */
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
#endif
- if (GetDatabaseEncoding() == PG_UTF8)
- {
- if (pg_regex_locale)
- pg_regex_strategy = PG_REGEX_LOCALE_WIDE_L;
- else
- pg_regex_strategy = PG_REGEX_LOCALE_WIDE;
}
else
{
- if (pg_regex_locale)
- pg_regex_strategy = PG_REGEX_LOCALE_1BYTE_L;
+ /* COLLPROVIDER_LIBC */
+
+ if (GetDatabaseEncoding() == PG_UTF8)
+ {
+ if (pg_regex_locale)
+ pg_regex_strategy = PG_REGEX_LOCALE_WIDE_L;
+ else
+ pg_regex_strategy = PG_REGEX_LOCALE_WIDE;
+ }
else
- pg_regex_strategy = PG_REGEX_LOCALE_1BYTE;
+ {
+ if (pg_regex_locale)
+ pg_regex_strategy = PG_REGEX_LOCALE_1BYTE_L;
+ else
+ pg_regex_strategy = PG_REGEX_LOCALE_1BYTE;
+ }
}
pg_regex_collation = collation;
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index 30696e3..c0be175 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -1446,7 +1446,7 @@ typedef int32_t (*ICU_Convert_Func) (UChar *dest, int32_t destCapacity,
UErrorCode *pErrorCode);
static int32_t
-icu_convert_case(ICU_Convert_Func func, pg_locale_t mylocale,
+icu_convert_case(ICU_Convert_Func func, const char *locale,
UChar **buff_dest, UChar *buff_source, int32_t len_source)
{
UErrorCode status;
@@ -1456,7 +1456,7 @@ icu_convert_case(ICU_Convert_Func func, pg_locale_t mylocale,
*buff_dest = palloc(len_dest * sizeof(**buff_dest));
status = U_ZERO_ERROR;
len_dest = func(*buff_dest, len_dest, buff_source, len_source,
- mylocale->info.icu.locale, &status);
+ locale, &status);
if (status == U_BUFFER_OVERFLOW_ERROR)
{
/* try again with adjusted length */
@@ -1464,7 +1464,7 @@ icu_convert_case(ICU_Convert_Func func, pg_locale_t mylocale,
*buff_dest = palloc(len_dest * sizeof(**buff_dest));
status = U_ZERO_ERROR;
len_dest = func(*buff_dest, len_dest, buff_source, len_source,
- mylocale->info.icu.locale, &status);
+ locale, &status);
}
if (U_FAILURE(status))
ereport(ERROR,
@@ -1522,8 +1522,15 @@ str_tolower(const char *buff, size_t nbytes, Oid collid)
else
{
pg_locale_t mylocale = 0;
+ char collprovider;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY,
+ use_icu;
- if (collid != DEFAULT_COLLATION_OID)
+ if (collid == DEFAULT_COLLATION_OID)
+ {
+ collprovider = get_default_collprovider();
+ }
+ else
{
if (!OidIsValid(collid))
{
@@ -1537,25 +1544,43 @@ str_tolower(const char *buff, size_t nbytes, Oid collid)
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
mylocale = pg_newlocale_from_collation(collid);
+ collprovider = mylocale->provider;
}
-#ifdef USE_ICU
- if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
+
+ if (use_icu)
{
+#ifdef USE_ICU
int32_t len_uchar;
int32_t len_conv;
UChar *buff_uchar;
UChar *buff_conv;
+ const char *locale;
+
+ if (mylocale)
+ locale = mylocale->info.icu.locale;
+ else
+ locale = get_icu_default_collate();
len_uchar = icu_to_uchar(&buff_uchar, buff, nbytes);
- len_conv = icu_convert_case(u_strToLower, mylocale,
+ len_conv = icu_convert_case(u_strToLower, locale,
&buff_conv, buff_uchar, len_uchar);
icu_from_uchar(&result, buff_conv, len_conv);
pfree(buff_uchar);
+#else /* not USE_ICU */
+ /* shouldn't happen */
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
+#endif /* not USE_ICU */
}
else
-#endif
{
+ /* use_libc */
+
if (pg_database_encoding_max_length() > 1)
{
wchar_t *workspace;
@@ -1644,8 +1669,15 @@ str_toupper(const char *buff, size_t nbytes, Oid collid)
else
{
pg_locale_t mylocale = 0;
+ char collprovider;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY,
+ use_icu;
- if (collid != DEFAULT_COLLATION_OID)
+ if (collid == DEFAULT_COLLATION_OID)
+ {
+ collprovider = get_default_collprovider();
+ }
+ else
{
if (!OidIsValid(collid))
{
@@ -1659,25 +1691,43 @@ str_toupper(const char *buff, size_t nbytes, Oid collid)
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
mylocale = pg_newlocale_from_collation(collid);
+ collprovider = mylocale->provider;
}
-#ifdef USE_ICU
- if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
+
+ if (use_icu)
{
+#ifdef USE_ICU
int32_t len_uchar,
len_conv;
UChar *buff_uchar;
UChar *buff_conv;
+ const char *locale;
+
+ if (mylocale)
+ locale = mylocale->info.icu.locale;
+ else
+ locale = get_icu_default_collate();
len_uchar = icu_to_uchar(&buff_uchar, buff, nbytes);
- len_conv = icu_convert_case(u_strToUpper, mylocale,
+ len_conv = icu_convert_case(u_strToUpper, locale,
&buff_conv, buff_uchar, len_uchar);
icu_from_uchar(&result, buff_conv, len_conv);
pfree(buff_uchar);
+#else /* not USE_ICU */
+ /* shouldn't happen */
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
+#endif /* not USE_ICU */
}
else
-#endif
{
+ /* use_libc */
+
if (pg_database_encoding_max_length() > 1)
{
wchar_t *workspace;
@@ -1767,8 +1817,15 @@ str_initcap(const char *buff, size_t nbytes, Oid collid)
else
{
pg_locale_t mylocale = 0;
+ char collprovider;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY,
+ use_icu;
- if (collid != DEFAULT_COLLATION_OID)
+ if (collid == DEFAULT_COLLATION_OID)
+ {
+ collprovider = get_default_collprovider();
+ }
+ else
{
if (!OidIsValid(collid))
{
@@ -1782,25 +1839,43 @@ str_initcap(const char *buff, size_t nbytes, Oid collid)
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
mylocale = pg_newlocale_from_collation(collid);
+ collprovider = mylocale->provider;
}
-#ifdef USE_ICU
- if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
+
+ if (use_icu)
{
+#ifdef USE_ICU
int32_t len_uchar,
len_conv;
UChar *buff_uchar;
UChar *buff_conv;
+ const char *locale;
+
+ if (mylocale)
+ locale = mylocale->info.icu.locale;
+ else
+ locale = get_icu_default_collate();
len_uchar = icu_to_uchar(&buff_uchar, buff, nbytes);
- len_conv = icu_convert_case(u_strToTitle_default_BI, mylocale,
+ len_conv = icu_convert_case(u_strToTitle_default_BI, locale,
&buff_conv, buff_uchar, len_uchar);
icu_from_uchar(&result, buff_conv, len_conv);
pfree(buff_uchar);
+#else /* not USE_ICU */
+ /* shouldn't happen */
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
+#endif /* not USE_ICU */
}
else
-#endif
{
+ /* use_libc */
+
if (pg_database_encoding_max_length() > 1)
{
wchar_t *workspace;
diff --git a/src/backend/utils/adt/like.c b/src/backend/utils/adt/like.c
index ff716c5..28ea64f 100644
--- a/src/backend/utils/adt/like.c
+++ b/src/backend/utils/adt/like.c
@@ -167,6 +167,9 @@ Generic_Text_IC_like(text *str, text *pat, Oid collation)
plen;
pg_locale_t locale = 0;
bool locale_is_c = false;
+ char collprovider = COLLPROVIDER_LIBC;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY;
+ bool use_icu;
if (lc_ctype_is_c(collation))
locale_is_c = true;
@@ -184,7 +187,18 @@ Generic_Text_IC_like(text *str, text *pat, Oid collation)
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
locale = pg_newlocale_from_collation(collation);
+ collprovider = locale->provider;
}
+ else
+ {
+ collprovider = get_default_collprovider();
+ }
+
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
/*
* For efficiency reasons, in the single byte case we don't call lower()
@@ -194,7 +208,7 @@ Generic_Text_IC_like(text *str, text *pat, Oid collation)
* way.
*/
- if (pg_database_encoding_max_length() > 1 || (locale && locale->provider == COLLPROVIDER_ICU))
+ if (pg_database_encoding_max_length() > 1 || use_icu)
{
/* lower's result is never packed, so OK to use old macros here */
pat = DatumGetTextPP(DirectFunctionCall1Coll(lower, collation,
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index a3dc3be..5d7c66b 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -56,7 +56,10 @@
#include "access/htup_details.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_control.h"
+#include "catalog/pg_database.h"
+#include "common/pg_collation_fn_common.h"
#include "mb/pg_wchar.h"
+#include "miscadmin.h"
#include "utils/builtins.h"
#include "utils/hsearch.h"
#include "utils/lsyscache.h"
@@ -132,6 +135,10 @@ static HTAB *collation_cache = NULL;
static char *IsoLocaleName(const char *); /* MSVC specific */
#endif
+#ifdef USE_ICU
+static char *check_icu_locale(const char *locale);
+#endif
+
/*
* pg_perm_setlocale
@@ -146,13 +153,45 @@ static char *IsoLocaleName(const char *); /* MSVC specific */
* also be unset to fully ensure that, but that has to be done elsewhere after
* all the individual LC_XXX variables have been set correctly. (Thank you
* Perl for making this kluge necessary.)
+ *
+ * Set collprovider to '\0' if category is not LC_COLLATE.
*/
-char *
-pg_perm_setlocale(int category, const char *locale)
+const char *
+pg_perm_setlocale(int category, const char *locale, char collprovider)
{
- char *result;
+ const char *result;
const char *envvar;
char *envbuf;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY =
+ category != LC_COLLATE || collprovider == COLLPROVIDER_LIBC;
+ bool use_icu =
+ category == LC_COLLATE && collprovider == COLLPROVIDER_ICU;
+
+ Assert(use_libc || use_icu);
+
+ if (use_icu)
+ {
+#ifdef USE_ICU
+ UErrorCode status = U_ZERO_ERROR;
+ char *icu_locale = check_icu_locale(locale);
+
+ if (icu_locale == NULL && locale != NULL)
+ return NULL; /* fall out immediately on failure */
+
+ uloc_setDefault(icu_locale, &status);
+ if (U_FAILURE(status))
+ return NULL; /* fall out immediately on failure */
+
+ result = uloc_getDefault();
+ if (icu_locale)
+ pfree(icu_locale);
+ return result;
+#else /* not USE_ICU */
+ return NULL; /* fall out immediately on failure */
+#endif /* not USE_ICU */
+ }
+
+ /* use libc */
#ifndef WIN32
result = setlocale(category, locale);
@@ -167,7 +206,7 @@ pg_perm_setlocale(int category, const char *locale)
#ifdef LC_MESSAGES
if (category == LC_MESSAGES)
{
- result = (char *) locale;
+ result = locale;
if (locale == NULL || locale[0] == '\0')
return result;
}
@@ -218,7 +257,7 @@ pg_perm_setlocale(int category, const char *locale)
#ifdef WIN32
result = IsoLocaleName(locale);
if (result == NULL)
- result = (char *) locale;
+ result = locale;
#endif /* WIN32 */
break;
#endif /* LC_MESSAGES */
@@ -259,34 +298,102 @@ pg_perm_setlocale(int category, const char *locale)
* it seems that on most implementations that's the only thing it's good for;
* we could wish that setlocale gave back a canonically spelled version of
* the locale name, but typically it doesn't.)
+ *
+ * Set collprovider to '\0' if category is not LC_COLLATE.
*/
bool
-check_locale(int category, const char *locale, char **canonname)
+check_locale(int category, const char *locale, char **canonname,
+ char collprovider)
{
- char *save;
- char *res;
+ const char *save;
+ const char *res;
+ char *save_dup;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY =
+ category != LC_COLLATE || collprovider == COLLPROVIDER_LIBC;
+ bool use_icu =
+ category == LC_COLLATE && collprovider == COLLPROVIDER_ICU;
+#ifdef USE_ICU
+ UErrorCode status;
+ char *icu_locale;
+#endif
+
+ Assert(use_libc || use_icu);
if (canonname)
*canonname = NULL; /* in case of failure */
- save = setlocale(category, NULL);
- if (!save)
- return false; /* won't happen, we hope */
+#ifndef USE_ICU
+ /* cannot use icu functions */
+ if (use_icu)
+ return false;
+#endif
+
+#ifdef USE_ICU
+ if (use_icu)
+ {
+ save = uloc_getDefault();
+ if (!save)
+ return false; /* won't happen, we hope */
+ }
+ else
+#endif
+ {
+ /* use_libc */
+ save = setlocale(category, NULL);
+ if (!save)
+ return false; /* won't happen, we hope */
+ }
/* save may be pointing at a modifiable scratch variable, see above. */
- save = pstrdup(save);
+ save_dup = pstrdup(save);
/* set the locale with setlocale, to see if it accepts it. */
- res = setlocale(category, locale);
+#ifdef USE_ICU
+ if (use_icu)
+ {
+ icu_locale = check_icu_locale(locale);
+
+ if (icu_locale == NULL && locale != NULL)
+ return false; /* locale could not be resolved */
+
+ status = U_ZERO_ERROR;
+ uloc_setDefault(icu_locale, &status);
+ if (U_FAILURE(status))
+ return false; /* ICU did not accept the locale */
+
+ res = uloc_getDefault();
+ if (icu_locale)
+ pfree(icu_locale);
+ }
+ else
+#endif
+ {
+ /* use_libc */
+ res = setlocale(category, locale);
+ }
/* save canonical name if requested. */
if (res && canonname)
*canonname = pstrdup(res);
/* restore old value. */
- if (!setlocale(category, save))
- elog(WARNING, "failed to restore old locale \"%s\"", save);
- pfree(save);
+#ifdef USE_ICU
+ if (use_icu)
+ {
+ status = U_ZERO_ERROR;
+ uloc_setDefault(save_dup, &status);
+ if (U_FAILURE(status))
+ elog(WARNING, "ICU error: failed to restore old locale \"%s\"",
+ save_dup);
+ }
+ else
+#endif
+ {
+ /* use_libc */
+ if (!setlocale(category, save_dup))
+ elog(WARNING, "failed to restore old locale \"%s\"", save_dup);
+ }
+ pfree(save_dup);
return (res != NULL);
}
@@ -306,7 +413,7 @@ check_locale(int category, const char *locale, char **canonname)
bool
check_locale_monetary(char **newval, void **extra, GucSource source)
{
- return check_locale(LC_MONETARY, *newval, NULL);
+ return check_locale(LC_MONETARY, *newval, NULL, '\0');
}
void
@@ -318,7 +425,7 @@ assign_locale_monetary(const char *newval, void *extra)
bool
check_locale_numeric(char **newval, void **extra, GucSource source)
{
- return check_locale(LC_NUMERIC, *newval, NULL);
+ return check_locale(LC_NUMERIC, *newval, NULL, '\0');
}
void
@@ -330,7 +437,7 @@ assign_locale_numeric(const char *newval, void *extra)
bool
check_locale_time(char **newval, void **extra, GucSource source)
{
- return check_locale(LC_TIME, *newval, NULL);
+ return check_locale(LC_TIME, *newval, NULL, '\0');
}
void
@@ -366,7 +473,7 @@ check_locale_messages(char **newval, void **extra, GucSource source)
* On Windows, we can't even check the value, so accept blindly
*/
#if defined(LC_MESSAGES) && !defined(WIN32)
- return check_locale(LC_MESSAGES, *newval, NULL);
+ return check_locale(LC_MESSAGES, *newval, NULL, '\0');
#else
return true;
#endif
@@ -380,7 +487,7 @@ assign_locale_messages(const char *newval, void *extra)
* We ignore failure, as per comment above.
*/
#ifdef LC_MESSAGES
- (void) pg_perm_setlocale(LC_MESSAGES, newval);
+ (void) pg_perm_setlocale(LC_MESSAGES, newval, '\0');
#endif
}
@@ -1096,21 +1203,14 @@ lookup_collation_cache(Oid collation, bool set_flags)
/* Attempt to set the flags */
HeapTuple tp;
Form_pg_collation collform;
- const char *collcollate;
- const char *collctype;
tp = SearchSysCache1(COLLOID, ObjectIdGetDatum(collation));
if (!HeapTupleIsValid(tp))
elog(ERROR, "cache lookup failed for collation %u", collation);
collform = (Form_pg_collation) GETSTRUCT(tp);
- collcollate = NameStr(collform->collcollate);
- collctype = NameStr(collform->collctype);
-
- cache_entry->collate_is_c = ((strcmp(collcollate, "C") == 0) ||
- (strcmp(collcollate, "POSIX") == 0));
- cache_entry->ctype_is_c = ((strcmp(collctype, "C") == 0) ||
- (strcmp(collctype, "POSIX") == 0));
+ cache_entry->collate_is_c = locale_is_c(NameStr(collform->collcollate));
+ cache_entry->ctype_is_c = locale_is_c(NameStr(collform->collctype));
cache_entry->flags_valid = true;
@@ -1141,20 +1241,28 @@ lc_collate_is_c(Oid collation)
if (collation == DEFAULT_COLLATION_OID)
{
static int result = -1;
- char *localeptr;
+ char collprovider;
if (result >= 0)
return (bool) result;
- localeptr = setlocale(LC_COLLATE, NULL);
- if (!localeptr)
- elog(ERROR, "invalid LC_COLLATE setting");
-
- if (strcmp(localeptr, "C") == 0)
- result = true;
- else if (strcmp(localeptr, "POSIX") == 0)
- result = true;
- else
+
+ collprovider = get_default_collprovider();
+ Assert(is_valid_nondefault_collprovider(collprovider));
+
+ if (collprovider == COLLPROVIDER_ICU)
+ {
result = false;
+ }
+ else
+ {
+ /* COLLPROVIDER_LIBC */
+ char *localeptr = setlocale(LC_COLLATE, NULL);
+
+ if (!localeptr)
+ elog(ERROR, "invalid LC_COLLATE setting");
+
+ result = locale_is_c(localeptr);
+ }
return (bool) result;
}
@@ -1191,20 +1299,28 @@ lc_ctype_is_c(Oid collation)
if (collation == DEFAULT_COLLATION_OID)
{
static int result = -1;
- char *localeptr;
+ char collprovider;
if (result >= 0)
return (bool) result;
- localeptr = setlocale(LC_CTYPE, NULL);
- if (!localeptr)
- elog(ERROR, "invalid LC_CTYPE setting");
-
- if (strcmp(localeptr, "C") == 0)
- result = true;
- else if (strcmp(localeptr, "POSIX") == 0)
- result = true;
- else
+
+ collprovider = get_default_collprovider();
+ Assert(is_valid_nondefault_collprovider(collprovider));
+
+ if (collprovider == COLLPROVIDER_ICU)
+ {
result = false;
+ }
+ else
+ {
+ /* COLLPROVIDER_LIBC */
+ char *localeptr = setlocale(LC_CTYPE, NULL);
+
+ if (!localeptr)
+ elog(ERROR, "invalid LC_CTYPE setting");
+
+ result = locale_is_c(localeptr);
+ }
return (bool) result;
}
@@ -1365,25 +1481,15 @@ pg_newlocale_from_collation(Oid collid)
else if (collform->collprovider == COLLPROVIDER_ICU)
{
#ifdef USE_ICU
- UCollator *collator;
- UErrorCode status;
-
if (strcmp(collcollate, collctype) != 0)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("collations with different collate and ctype values are not supported by ICU")));
- status = U_ZERO_ERROR;
- collator = ucol_open(collcollate, &status);
- if (U_FAILURE(status))
- ereport(ERROR,
- (errmsg("could not open collator for locale \"%s\": %s",
- collcollate, u_errorName(status))));
-
/* We will leak this string if we get an error below :-( */
result.info.icu.locale = MemoryContextStrdup(TopMemoryContext,
collcollate);
- result.info.icu.ucol = collator;
+ result.info.icu.ucol = open_collator(collcollate);
#else /* not USE_ICU */
/* could get here if a collation was created by a build with ICU */
ereport(ERROR,
@@ -1440,46 +1546,6 @@ pg_newlocale_from_collation(Oid collid)
return cache_entry->locale;
}
-/*
- * Get provider-specific collation version string for the given collation from
- * the operating system/library.
- *
- * A particular provider must always either return a non-NULL string or return
- * NULL (if it doesn't support versions). It must not return NULL for some
- * collcollate and not NULL for others.
- */
-char *
-get_collation_actual_version(char collprovider, const char *collcollate)
-{
- char *collversion;
-
-#ifdef USE_ICU
- if (collprovider == COLLPROVIDER_ICU)
- {
- UCollator *collator;
- UErrorCode status;
- UVersionInfo versioninfo;
- char buf[U_MAX_VERSION_STRING_LENGTH];
-
- status = U_ZERO_ERROR;
- collator = ucol_open(collcollate, &status);
- if (U_FAILURE(status))
- ereport(ERROR,
- (errmsg("could not open collator for locale \"%s\": %s",
- collcollate, u_errorName(status))));
- ucol_getVersion(collator, versioninfo);
- ucol_close(collator);
-
- u_versionToString(versioninfo, buf);
- collversion = pstrdup(buf);
- }
- else
-#endif
- collversion = NULL;
-
- return collversion;
-}
-
#ifdef USE_ICU
/*
@@ -1761,3 +1827,125 @@ char2wchar(wchar_t *to, size_t tolen, const char *from, size_t fromlen,
return result;
}
+
+#ifdef USE_ICU
+/*
+ * If locale is "" return the environment value from setlocale().
+ *
+ * Otherwise return a palloc'd copy of locale if it is not NULL.
+ */
+static char *
+check_icu_locale(const char *locale)
+{
+ char *canonname = NULL;
+ char *winlocale = NULL;
+ char *result;
+
+ /* Windows locales can be in the format ".codepage" */
+ if (locale && (strlen(locale) == 0 || locale[0] == '.'))
+ {
+ check_locale(LC_COLLATE, locale, &canonname, COLLPROVIDER_LIBC);
+ locale = (const char *) canonname;
+ }
+
+#ifdef WIN32
+ if (!locale_is_c(locale))
+ {
+ winlocale = check_icu_winlocale(locale);
+ locale = (const char *) winlocale;
+ }
+#endif
+
+ result = locale ? pstrdup(locale) : NULL;
+
+ if (canonname)
+ pfree(canonname);
+ if (winlocale)
+ pfree(winlocale);
+
+ return result;
+}
+
+/*
+ * Get the default icu collation.
+ */
+const char *
+get_icu_default_collate(void)
+{
+ /* Cache the result so we only have to compute it once. */
+ static char result[NAMEDATALEN];
+ static bool cached = false;
+ const char *locale,
+ *collate;
+ char *langtag;
+
+ if (cached)
+ return result;
+
+ locale = uloc_getDefault();
+ if (!locale)
+ ereport(ERROR, (errmsg("ICU error: uloc_getDefault() failed")));
+
+ langtag = get_icu_language_tag(locale);
+ collate = get_icu_collate(locale, langtag);
+
+ if (strlen(collate) >= NAMEDATALEN)
+ ereport(FATAL,
+ (errmsg("the default ICU collation name \"%s\" is too long", collate)));
+
+ strcpy(result, collate);
+ cached = true;
+
+ pfree(langtag);
+ return result;
+}
+
+/*
+ * Get the collator for the default ICU collation.
+ */
+UCollator *
+get_default_collation_collator(void)
+{
+ /* Cache the result so we only have to compute it once. */
+ static UCollator *collator = NULL;
+
+ if (collator)
+ return collator;
+
+ collator = open_collator(get_icu_default_collate());
+ return collator;
+}
+#endif /* USE_ICU */
+
+/*
+ * Get the default collation provider.
+ */
+char
+get_default_collprovider(void)
+{
+ /* Cache the result so we only have to compute it once. */
+ static char result = '\0';
+ HeapTuple tp;
+ Form_pg_database dbform;
+ char *datcollate;
+
+ if (result)
+ return result;
+
+ tp = SearchSysCache1(DATABASEOID, ObjectIdGetDatum(MyDatabaseId));
+ if (!HeapTupleIsValid(tp))
+ elog(ERROR, "cache lookup failed for database %u", MyDatabaseId);
+
+ dbform = (Form_pg_database) GETSTRUCT(tp);
+ datcollate = NameStr(dbform->datcollate);
+ check_locale_collprovider(datcollate, NULL, &result, NULL);
+
+ if (!is_valid_nondefault_collprovider(result))
+ /* This could happen when manually creating a mess in the catalogs. */
+ ereport(FATAL,
+ (errmsg("could not find out the collation provider for datcollate \"%s\" of database \"%s\"",
+ datcollate, NameStr(dbform->datname))));
+
+ ReleaseSysCache(tp);
+ return result;
+}
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index f1c78ff..ad7f08a 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -5736,13 +5736,14 @@ find_join_input_rel(PlannerInfo *root, Relids relids)
*/
static int
pattern_char_isalpha(char c, bool is_multibyte,
- pg_locale_t locale, bool locale_is_c)
+ pg_locale_t locale, char collprovider, bool locale_is_c)
{
if (locale_is_c)
return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
else if (is_multibyte && IS_HIGHBIT_SET(c))
return true;
- else if (locale && locale->provider == COLLPROVIDER_ICU)
+ else if (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII)
return IS_HIGHBIT_SET(c) ? true : false;
#ifdef HAVE_LOCALE_T
else if (locale && locale->provider == COLLPROVIDER_LIBC)
@@ -5778,6 +5779,7 @@ like_fixed_prefix(Const *patt_const, bool case_insensitive, Oid collation,
bool is_multibyte = (pg_database_encoding_max_length() > 1);
pg_locale_t locale = 0;
bool locale_is_c = false;
+ char collprovider = COLLPROVIDER_LIBC;
/* the right-hand const is type text or bytea */
Assert(typeid == BYTEAOID || typeid == TEXTOID);
@@ -5806,6 +5808,11 @@ like_fixed_prefix(Const *patt_const, bool case_insensitive, Oid collation,
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
locale = pg_newlocale_from_collation(collation);
+ collprovider = locale->provider;
+ }
+ else
+ {
+ collprovider = get_default_collprovider();
}
}
@@ -5843,7 +5850,8 @@ like_fixed_prefix(Const *patt_const, bool case_insensitive, Oid collation,
/* Stop if case-varying character (it's sort of a wildcard) */
if (case_insensitive &&
- pattern_char_isalpha(patt[pos], is_multibyte, locale, locale_is_c))
+ pattern_char_isalpha(patt[pos], is_multibyte, locale,
+ collprovider, locale_is_c))
break;
match[match_pos++] = patt[pos];
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index a5e812d..136155f 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -1401,8 +1401,15 @@ varstr_cmp(const char *arg1, int len1, const char *arg2, int len2, Oid collid)
char *a1p,
*a2p;
pg_locale_t mylocale = 0;
+ char collprovider;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY,
+ use_icu;
- if (collid != DEFAULT_COLLATION_OID)
+ if (collid == DEFAULT_COLLATION_OID)
+ {
+ collprovider = get_default_collprovider();
+ }
+ else
{
if (!OidIsValid(collid))
{
@@ -1416,8 +1423,15 @@ varstr_cmp(const char *arg1, int len1, const char *arg2, int len2, Oid collid)
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
mylocale = pg_newlocale_from_collation(collid);
+ collprovider = mylocale->provider;
}
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
+
/*
* memcmp() can't tell us which of two unequal strings sorts first,
* but it's a cheap way to tell if they're equal. Testing shows that
@@ -1432,8 +1446,7 @@ varstr_cmp(const char *arg1, int len1, const char *arg2, int len2, Oid collid)
#ifdef WIN32
/* Win32 does not have UTF-8, so we need to map to UTF-16 */
- if (GetDatabaseEncoding() == PG_UTF8
- && (!mylocale || mylocale->provider == COLLPROVIDER_LIBC))
+ if (GetDatabaseEncoding() == PG_UTF8 && use_libc)
{
int a1len;
int a2len;
@@ -1535,60 +1548,67 @@ varstr_cmp(const char *arg1, int len1, const char *arg2, int len2, Oid collid)
memcpy(a2p, arg2, len2);
a2p[len2] = '\0';
- if (mylocale)
+ if (use_icu)
{
- if (mylocale->provider == COLLPROVIDER_ICU)
- {
#ifdef USE_ICU
+ UCollator *collator;
+
+ if (mylocale)
+ collator = mylocale->info.icu.ucol;
+ else
+ collator = get_default_collation_collator();
+
#ifdef HAVE_UCOL_STRCOLLUTF8
- if (GetDatabaseEncoding() == PG_UTF8)
- {
- UErrorCode status;
+ if (GetDatabaseEncoding() == PG_UTF8)
+ {
+ UErrorCode status;
- status = U_ZERO_ERROR;
- result = ucol_strcollUTF8(mylocale->info.icu.ucol,
- arg1, len1,
- arg2, len2,
- &status);
- if (U_FAILURE(status))
- ereport(ERROR,
- (errmsg("collation failed: %s", u_errorName(status))));
- }
- else
+ status = U_ZERO_ERROR;
+ result = ucol_strcollUTF8(collator,
+ arg1, len1,
+ arg2, len2,
+ &status);
+ if (U_FAILURE(status))
+ ereport(ERROR,
+ (errmsg("collation failed: %s", u_errorName(status))));
+ }
+ else
#endif
- {
- int32_t ulen1,
- ulen2;
- UChar *uchar1,
- *uchar2;
+ {
+ int32_t ulen1,
+ ulen2;
+ UChar *uchar1,
+ *uchar2;
- ulen1 = icu_to_uchar(&uchar1, arg1, len1);
- ulen2 = icu_to_uchar(&uchar2, arg2, len2);
+ ulen1 = icu_to_uchar(&uchar1, arg1, len1);
+ ulen2 = icu_to_uchar(&uchar2, arg2, len2);
- result = ucol_strcoll(mylocale->info.icu.ucol,
- uchar1, ulen1,
- uchar2, ulen2);
+ result = ucol_strcoll(collator,
+ uchar1, ulen1,
+ uchar2, ulen2);
- pfree(uchar1);
- pfree(uchar2);
- }
+ pfree(uchar1);
+ pfree(uchar2);
+ }
#else /* not USE_ICU */
- /* shouldn't happen */
- elog(ERROR, "unsupported collprovider: %c", mylocale->provider);
+ /* shouldn't happen */
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
#endif /* not USE_ICU */
- }
- else
- {
+ }
+ else
+ {
+ /* use_libc */
+
+ if (mylocale)
#ifdef HAVE_LOCALE_T
result = strcoll_l(a1p, a2p, mylocale->info.lt);
#else
/* shouldn't happen */
- elog(ERROR, "unsupported collprovider: %c", mylocale->provider);
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
#endif
- }
+ else
+ result = strcoll(a1p, a2p);
}
- else
- result = strcoll(a1p, a2p);
/*
* In some locales strcoll() can claim that nonidentical strings are
@@ -1838,6 +1858,9 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
bool collate_c = false;
VarStringSortSupport *sss;
pg_locale_t locale = 0;
+ char collprovider = '\0';
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY = false;
+ bool use_icu = false;
/*
* If possible, set ssup->comparator to a function which can be used to
@@ -1867,7 +1890,11 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
* we'll figure out the collation based on the locale id and cache the
* result.
*/
- if (collid != DEFAULT_COLLATION_OID)
+ if (collid == DEFAULT_COLLATION_OID)
+ {
+ collprovider = get_default_collprovider();
+ }
+ else
{
if (!OidIsValid(collid))
{
@@ -1881,8 +1908,15 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
locale = pg_newlocale_from_collation(collid);
+ collprovider = locale->provider;
}
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
+
/*
* There is a further exception on Windows. When the database
* encoding is UTF-8 and we are not using the C collation, complex
@@ -1892,8 +1926,7 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
* trampoline. ICU locales work just the same on Windows, however.
*/
#ifdef WIN32
- if (GetDatabaseEncoding() == PG_UTF8 &&
- !(locale && locale->provider == COLLPROVIDER_ICU))
+ if (GetDatabaseEncoding() == PG_UTF8 && use_libc)
return;
#endif
@@ -1922,7 +1955,7 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
* platforms.
*/
#ifndef TRUST_STRXFRM
- if (!collate_c && !(locale && locale->provider == COLLPROVIDER_ICU))
+ if (!collate_c && !use_icu)
abbreviate = false;
#endif
@@ -2064,6 +2097,9 @@ varstrfastcmp_locale(Datum x, Datum y, SortSupport ssup)
VarString *arg2 = DatumGetVarStringPP(y);
bool arg1_match;
VarStringSortSupport *sss = (VarStringSortSupport *) ssup->ssup_extra;
+ char collprovider;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY,
+ use_icu;
/* working state */
char *a1p,
@@ -2157,59 +2193,77 @@ varstrfastcmp_locale(Datum x, Datum y, SortSupport ssup)
}
if (sss->locale)
+ collprovider = sss->locale->provider;
+ else
+ collprovider = get_default_collprovider();
+
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
+
+ if (use_icu)
{
- if (sss->locale->provider == COLLPROVIDER_ICU)
- {
#ifdef USE_ICU
-#ifdef HAVE_UCOL_STRCOLLUTF8
- if (GetDatabaseEncoding() == PG_UTF8)
- {
- UErrorCode status;
+ UCollator *collator;
- status = U_ZERO_ERROR;
- result = ucol_strcollUTF8(sss->locale->info.icu.ucol,
- a1p, len1,
- a2p, len2,
- &status);
- if (U_FAILURE(status))
- ereport(ERROR,
- (errmsg("collation failed: %s", u_errorName(status))));
- }
- else
+ if (sss->locale)
+ collator = sss->locale->info.icu.ucol;
+ else
+ collator = get_default_collation_collator();
+
+#ifdef HAVE_UCOL_STRCOLLUTF8
+ if (GetDatabaseEncoding() == PG_UTF8)
+ {
+ UErrorCode status;
+
+ status = U_ZERO_ERROR;
+ result = ucol_strcollUTF8(collator,
+ a1p, len1,
+ a2p, len2,
+ &status);
+ if (U_FAILURE(status))
+ ereport(ERROR,
+ (errmsg("collation failed: %s", u_errorName(status))));
+ }
+ else
#endif
- {
- int32_t ulen1,
- ulen2;
- UChar *uchar1,
- *uchar2;
+ {
+ int32_t ulen1,
+ ulen2;
+ UChar *uchar1,
+ *uchar2;
- ulen1 = icu_to_uchar(&uchar1, a1p, len1);
- ulen2 = icu_to_uchar(&uchar2, a2p, len2);
+ ulen1 = icu_to_uchar(&uchar1, a1p, len1);
+ ulen2 = icu_to_uchar(&uchar2, a2p, len2);
- result = ucol_strcoll(sss->locale->info.icu.ucol,
- uchar1, ulen1,
- uchar2, ulen2);
+ result = ucol_strcoll(collator,
+ uchar1, ulen1,
+ uchar2, ulen2);
- pfree(uchar1);
- pfree(uchar2);
- }
+ pfree(uchar1);
+ pfree(uchar2);
+ }
#else /* not USE_ICU */
- /* shouldn't happen */
- elog(ERROR, "unsupported collprovider: %c", sss->locale->provider);
+ /* shouldn't happen */
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
#endif /* not USE_ICU */
- }
- else
- {
+ }
+ else
+ {
+ /* use_libc */
+
+ if (sss->locale)
#ifdef HAVE_LOCALE_T
result = strcoll_l(sss->buf1, sss->buf2, sss->locale->info.lt);
#else
/* shouldn't happen */
- elog(ERROR, "unsupported collprovider: %c", sss->locale->provider);
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
#endif
- }
+ else
+ result = strcoll(sss->buf1, sss->buf2);
}
- else
- result = strcoll(sss->buf1, sss->buf2);
/*
* In some locales strcoll() can claim that nonidentical strings are
@@ -2314,6 +2368,9 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
else
{
Size bsize;
+ char collprovider;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY,
+ use_icu;
#ifdef USE_ICU
int32_t ulen = -1;
UChar *uchar = NULL;
@@ -2350,10 +2407,20 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
sss->buf1[len] = '\0';
sss->last_len1 = len;
+ if (sss->locale)
+ collprovider = sss->locale->provider;
+ else
+ collprovider = get_default_collprovider();
+
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
+
#ifdef USE_ICU
/* When using ICU and not UTF8, convert string to UChar. */
- if (sss->locale && sss->locale->provider == COLLPROVIDER_ICU &&
- GetDatabaseEncoding() != PG_UTF8)
+ if (use_icu && GetDatabaseEncoding() != PG_UTF8)
ulen = icu_to_uchar(&uchar, sss->buf1, len);
#endif
@@ -2367,9 +2434,15 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
*/
for (;;)
{
-#ifdef USE_ICU
- if (sss->locale && sss->locale->provider == COLLPROVIDER_ICU)
+ if (use_icu)
{
+#ifdef USE_ICU
+ UCollator *collator;
+
+ if (sss->locale)
+ collator = sss->locale->info.icu.ucol;
+ else
+ collator = get_default_collation_collator();
/*
* When using UTF8, use the iteration interface so we only
* need to produce as many bytes as we actually need.
@@ -2383,7 +2456,7 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
uiter_setUTF8(&iter, sss->buf1, len);
state[0] = state[1] = 0; /* won't need that again */
status = U_ZERO_ERROR;
- bsize = ucol_nextSortKeyPart(sss->locale->info.icu.ucol,
+ bsize = ucol_nextSortKeyPart(collator,
&iter,
state,
(uint8_t *) sss->buf2,
@@ -2395,19 +2468,26 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
u_errorName(status))));
}
else
- bsize = ucol_getSortKey(sss->locale->info.icu.ucol,
+ bsize = ucol_getSortKey(collator,
uchar, ulen,
(uint8_t *) sss->buf2, sss->buflen2);
+#else /* not USE_ICU */
+ /* shouldn't happen */
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
+#endif /* not USE_ICU */
}
else
-#endif
+ {
+ /* use_libc */
+
#ifdef HAVE_LOCALE_T
- if (sss->locale && sss->locale->provider == COLLPROVIDER_LIBC)
- bsize = strxfrm_l(sss->buf2, sss->buf1,
- sss->buflen2, sss->locale->info.lt);
- else
+ if (sss->locale)
+ bsize = strxfrm_l(sss->buf2, sss->buf1,
+ sss->buflen2, sss->locale->info.lt);
+ else
#endif
- bsize = strxfrm(sss->buf2, sss->buf1, sss->buflen2);
+ bsize = strxfrm(sss->buf2, sss->buf1, sss->buflen2);
+ }
sss->last_len2 = bsize;
if (bsize < sss->buflen2)
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 5ef6315..1f2df13 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -29,9 +29,11 @@
#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/pg_authid.h"
+#include "catalog/pg_collation.h"
#include "catalog/pg_database.h"
#include "catalog/pg_db_role_setting.h"
#include "catalog/pg_tablespace.h"
+#include "common/pg_collation_fn_common.h"
#include "libpq/auth.h"
#include "libpq/libpq-be.h"
#include "mb/pg_wchar.h"
@@ -296,6 +298,13 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect
Form_pg_database dbform;
char *collate;
char *ctype;
+ char *datcollate;
+ char collprovider;
+ char *collversion;
+ char *wincollate = NULL;
+ char *langtag = NULL;
+ const char *collcollate;
+ char *actual_versionstr;
/* Fetch our pg_database row normally, via syscache */
tup = SearchSysCache1(DATABASEOID, ObjectIdGetDatum(MyDatabaseId));
@@ -377,27 +386,124 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect
PGC_BACKEND, PGC_S_DYNAMIC_DEFAULT);
/* assign locale variables */
- collate = NameStr(dbform->datcollate);
ctype = NameStr(dbform->datctype);
+ datcollate = NameStr(dbform->datcollate);
+ check_locale_collprovider(datcollate, &collate, &collprovider,
+ &collversion);
- if (pg_perm_setlocale(LC_COLLATE, collate) == NULL)
+ if (!is_valid_nondefault_collprovider(collprovider))
+ /* This could happen when manually creating a mess in the catalogs. */
+ ereport(FATAL,
+ (errmsg("could not determine the collation provider for datcollate \"%s\" of database \"%s\"",
+ datcollate, name)));
+
+#ifndef USE_ICU
+ if (collprovider == COLLPROVIDER_ICU)
+ ereport(FATAL,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("ICU is not supported in this build"),
+ errhint("Recreate the database with libc locale or rebuild PostgreSQL using --with-icu.")));
+#endif
+
+ /* we always check lc_collate for libc */
+ if (pg_perm_setlocale(LC_COLLATE, collate, COLLPROVIDER_LIBC) == NULL)
ereport(FATAL,
(errmsg("database locale is incompatible with operating system"),
- errdetail("The database was initialized with LC_COLLATE \"%s\", "
- " which is not recognized by setlocale().", collate),
+ errdetail("The database was initialized with LC_COLLATE \"%s\" (provider \"%s\"), "
+ " which is not recognized by setlocale().",
+ collate, get_collprovider_name(COLLPROVIDER_LIBC)),
errhint("Recreate the database with another locale or install the missing locale.")));
- if (pg_perm_setlocale(LC_CTYPE, ctype) == NULL)
+ /* check lc_collate and lc_ctype for icu if we need it */
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+ if (pg_perm_setlocale(LC_COLLATE, collate, collprovider) == NULL)
+ ereport(FATAL,
+ (errmsg("database locale is incompatible with operating system"),
+ errdetail("The database was initialized with LC_COLLATE \"%s\" (provider \"%s\"), "
+ " which is not recognized by uloc_setDefault().",
+ collate, get_collprovider_name(collprovider)),
+ errhint("Recreate the database with another locale or install the missing locale.")));
+
+ /* This could happen when manually creating a mess in the catalogs. */
+ if (strcmp(collate, ctype) != 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("collations with different collate and ctype values are not supported by ICU")));
+ }
+
+ if (pg_perm_setlocale(LC_CTYPE, ctype, '\0') == NULL)
ereport(FATAL,
(errmsg("database locale is incompatible with operating system"),
errdetail("The database was initialized with LC_CTYPE \"%s\", "
" which is not recognized by setlocale().", ctype),
errhint("Recreate the database with another locale or install the missing locale.")));
+ /* get the actual version of the collation */
+
+#ifdef USE_ICU
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+ collcollate = (const char *) collate;
+#ifdef WIN32
+ if (!locale_is_c(collcollate))
+ {
+ wincollate = check_icu_winlocale(collcollate);
+ collcollate = (const char *) wincollate;
+ }
+#endif /* WIN32 */
+ langtag = get_icu_language_tag(collcollate);
+ collcollate = get_icu_collate(collcollate, langtag);
+ }
+ else
+#endif /* USE_ICU */
+ {
+ /* COLLPROVIDER_LIBC */
+ collcollate = (const char *) collate;
+ }
+
+ actual_versionstr = get_collation_actual_version(collprovider, collcollate);
+
+ /*
+ * Check the collation version (this matches the version checking in the
+ * function pg_newlocale_from_collation())
+ */
+ if (collversion)
+ {
+ if (!actual_versionstr)
+ {
+ /*
+ * This could happen when manually creating a mess in the catalogs.
+ */
+ ereport(ERROR,
+ (errmsg("collation \"%s\" (provider \"%s\") has no actual version, but a version was specified",
+ collate, get_collprovider_name(collprovider))));
+ }
+
+ if (strcmp(actual_versionstr, collversion) != 0)
+ ereport(ERROR,
+ (errmsg("collation \"%s\" (provider \"%s\") has version mismatch",
+ collate, get_collprovider_name(collprovider)),
+ errdetail("The collation in the database was created using version %s, "
+ "but the operating system provides version %s.",
+ collversion, actual_versionstr),
+ errhint("Build PostgreSQL with the right library version.")));
+ }
+
/* Make the locale settings visible as GUC variables, too */
- SetConfigOption("lc_collate", collate, PGC_INTERNAL, PGC_S_OVERRIDE);
+ SetConfigOption("lc_collate", datcollate, PGC_INTERNAL, PGC_S_OVERRIDE);
SetConfigOption("lc_ctype", ctype, PGC_INTERNAL, PGC_S_OVERRIDE);
+ pfree(collate);
+ if (collversion)
+ pfree(collversion);
+ if (langtag)
+ pfree(langtag);
+ if (actual_versionstr)
+ pfree(actual_versionstr);
+ if (wincollate)
+ pfree(wincollate);
+
check_strxfrm_bug();
ReleaseSysCache(tup);
diff --git a/src/backend/utils/mb/encnames.c b/src/backend/utils/mb/encnames.c
index 12b61cd..1e75257 100644
--- a/src/backend/utils/mb/encnames.c
+++ b/src/backend/utils/mb/encnames.c
@@ -403,8 +403,6 @@ const pg_enc2gettext pg_enc2gettext_tbl[] =
};
-#ifndef FRONTEND
-
/*
* Table of encoding names for ICU
*
@@ -457,6 +455,7 @@ is_encoding_supported_by_icu(int encoding)
return (pg_enc2icu_tbl[encoding] != NULL);
}
+#ifndef FRONTEND
const char *
get_encoding_name_for_icu(int encoding)
{
@@ -475,7 +474,6 @@ get_encoding_name_for_icu(int encoding)
return icu_encoding_name;
}
-
#endif /* not FRONTEND */
diff --git a/src/bin/initdb/Makefile b/src/bin/initdb/Makefile
index 8c23941..3733399 100644
--- a/src/bin/initdb/Makefile
+++ b/src/bin/initdb/Makefile
@@ -19,7 +19,7 @@ include $(top_builddir)/src/Makefile.global
override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) -I$(top_srcdir)/src/timezone $(CPPFLAGS)
# note: we need libpq only because fe_utils does
-LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport) $(ICU_LIBS)
# use system timezone data?
ifneq (,$(with_system_tzdata))
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 3f203c6..dba7b8c 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -55,6 +55,10 @@
#include <signal.h>
#include <time.h>
+#ifdef USE_ICU
+#include <unicode/uloc.h>
+#endif
+
#ifdef HAVE_SHM_OPEN
#include "sys/mman.h"
#endif
@@ -65,6 +69,7 @@
#include "catalog/pg_collation_d.h"
#include "common/file_perm.h"
#include "common/file_utils.h"
+#include "common/pg_collation_fn_common.h"
#include "common/restricted_token.h"
#include "common/username.h"
#include "fe_utils/string_utils.h"
@@ -144,6 +149,8 @@ static bool data_checksums = false;
static char *xlog_dir = NULL;
static char *str_wal_segment_size_mb = NULL;
static int wal_segment_size_mb;
+static char collprovider = '\0';
+static char *collversion = NULL;
/* internal vars */
@@ -268,10 +275,15 @@ static char *escape_quotes(const char *src);
static char *escape_quotes_bki(const char *src);
static int locale_date_order(const char *locale);
static void check_locale_name(int category, const char *locale,
- char **canonname);
-static bool check_locale_encoding(const char *locale, int encoding);
+ char **canonname, char collprovider);
+static bool check_locale_encoding(const char *locale, int encoding,
+ char collprovider);
static void setlocales(void);
static void usage(const char *progname);
+#ifdef USE_ICU
+static char *check_icu_locale_name(const char *locale);
+#endif
+static void set_collation_version(void);
void setup_pgdata(void);
void setup_bin_paths(const char *argv0);
void setup_data_file_paths(void);
@@ -1387,10 +1399,27 @@ bootstrap_template1(void)
char **bki_lines;
char headerline[MAXPGPATH];
char buf[64];
+ char *lc_collate_full_name;
printf(_("running bootstrap script ... "));
fflush(stdout);
+ Assert(lc_collate);
+
+ lc_collate_full_name = get_full_collation_name(lc_collate, collprovider,
+ collversion);
+
+ if (!lc_collate_full_name)
+ exit(1); /* get_full_collation_name printed the error */
+
+ if (strlen(lc_collate_full_name) >= NAMEDATALEN)
+ {
+ fprintf(stderr,
+ _("%s: the full collation name \"%s\" is too long\n"),
+ progname, lc_collate_full_name);
+ exit(1);
+ }
+
bki_lines = readfile(bki_file);
/* Check that bki file appears to be of the right version */
@@ -1432,7 +1461,7 @@ bootstrap_template1(void)
encodingid_to_string(encodingid));
bki_lines = replace_token(bki_lines, "LC_COLLATE",
- escape_quotes_bki(lc_collate));
+ escape_quotes_bki(lc_collate_full_name));
bki_lines = replace_token(bki_lines, "LC_CTYPE",
escape_quotes_bki(lc_ctype));
@@ -1474,6 +1503,7 @@ bootstrap_template1(void)
PG_CMD_CLOSE;
free(bki_lines);
+ free(lc_collate_full_name);
check_ok();
}
@@ -2224,53 +2254,143 @@ locale_date_order(const char *locale)
* the locale name, but typically it doesn't.)
*
* this should match the backend's check_locale() function
+ *
+ * Set collprovider to '\0' if category is not LC_COLLATE.
*/
static void
-check_locale_name(int category, const char *locale, char **canonname)
+check_locale_name(int category, const char *locale, char **canonname,
+ char collprovider)
{
- char *save;
- char *res;
+ const char *save;
+ const char *res;
+ char *save_dup;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY =
+ category != LC_COLLATE || collprovider == COLLPROVIDER_LIBC;
+ bool use_icu =
+ category == LC_COLLATE && collprovider == COLLPROVIDER_ICU;
+ bool failure = false;
+#ifdef USE_ICU
+ UErrorCode status;
+ char *icu_locale;
+#endif
- if (canonname)
- *canonname = NULL; /* in case of failure */
+ Assert(use_libc || use_icu);
- save = setlocale(category, NULL);
- if (!save)
+#ifndef USE_ICU
+ if (use_icu)
{
- fprintf(stderr, _("%s: setlocale() failed\n"),
+ fprintf(stderr,
+ _("%s: ICU is not supported in this build\n"
+ "You need to rebuild PostgreSQL using --with-icu.\n"),
progname);
exit(1);
}
+#endif
+
+ if (canonname)
+ *canonname = NULL; /* in case of failure */
+
+#ifdef USE_ICU
+ if (use_icu)
+ {
+ save = uloc_getDefault();
+ if (!save)
+ {
+ fprintf(stderr, _("%s: ICU error: uloc_getDefault() failed\n"),
+ progname);
+ exit(1);
+ }
+ }
+ else
+#endif
+ {
+ /* use_libc */
+ save = setlocale(category, NULL);
+ if (!save)
+ {
+ fprintf(stderr, _("%s: setlocale() failed\n"),
+ progname);
+ exit(1);
+ }
+ }
/* save may be pointing at a modifiable scratch variable, so copy it. */
- save = pg_strdup(save);
+ save_dup = pg_strdup(save);
/* for setlocale() call */
if (!locale)
locale = "";
/* set the locale with setlocale, to see if it accepts it. */
- res = setlocale(category, locale);
+#ifdef USE_ICU
+ if (use_icu)
+ {
+ icu_locale = check_icu_locale_name(locale);
+ if (icu_locale == NULL && locale != NULL)
+ {
+ failure = true;
+ res = NULL;
+ }
+ else
+ {
+ status = U_ZERO_ERROR;
+ uloc_setDefault(icu_locale, &status);
+ res = uloc_getDefault();
+ failure = (U_FAILURE(status) || res == NULL);
+ if (icu_locale)
+ pfree(icu_locale);
+ }
+ }
+ else
+#endif
+ {
+ /* use_libc */
+ res = setlocale(category, locale);
+ failure = (res == NULL);
+ }
/* save canonical name if requested. */
if (res && canonname)
*canonname = pg_strdup(res);
/* restore old value. */
- if (!setlocale(category, save))
+#ifdef USE_ICU
+ if (use_icu)
{
- fprintf(stderr, _("%s: failed to restore old locale \"%s\"\n"),
- progname, save);
- exit(1);
+ status = U_ZERO_ERROR;
+ uloc_setDefault(save_dup, &status);
+ if (U_FAILURE(status))
+ {
+ fprintf(stderr, _("%s: ICU error: failed to restore old locale \"%s\"\n"),
+ progname, save_dup);
+ exit(1);
+ }
}
- free(save);
+ else
+#endif
+ {
+ /* use_libc */
+ if (!setlocale(category, save_dup))
+ {
+ fprintf(stderr, _("%s: failed to restore old locale \"%s\"\n"),
+ progname, save_dup);
+ exit(1);
+ }
+ }
+ free(save_dup);
/* complain if locale wasn't valid */
- if (res == NULL)
+ if (failure)
{
if (*locale)
- fprintf(stderr, _("%s: invalid locale name \"%s\"\n"),
- progname, locale);
+ {
+ if (category == LC_COLLATE)
+ fprintf(stderr, _("%s: invalid locale name \"%s\" (provider \"%s\")\n"),
+ progname, locale, get_collprovider_name(collprovider));
+ else
+ fprintf(stderr, _("%s: invalid locale name \"%s\"\n"),
+ progname, locale);
+ }
else
{
/*
@@ -2292,9 +2412,11 @@ check_locale_name(int category, const char *locale, char **canonname)
* check if the chosen encoding matches the encoding required by the locale
*
* this should match the similar check in the backend createdb() function
+ *
+ * Set collprovider to '\0' if category is not LC_COLLATE.
*/
static bool
-check_locale_encoding(const char *locale, int user_enc)
+check_locale_encoding(const char *locale, int user_enc, char collprovider)
{
int locale_enc;
@@ -2321,6 +2443,25 @@ check_locale_encoding(const char *locale, int user_enc)
progname);
return false;
}
+
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+#ifdef USE_ICU
+ if (!is_encoding_supported_by_icu(user_enc))
+ {
+ fprintf(stderr, _("%s: selected encoding (%s) is not supported for ICU locales\n"),
+ progname, pg_encoding_to_char(user_enc));
+ return false;
+ }
+#else /* not USE_ICU */
+ fprintf(stderr,
+ _("%s: ICU is not supported in this build\n"
+ "You need to rebuild PostgreSQL using --with-icu.\n"),
+ progname);
+ exit(1);
+#endif /* not USE_ICU */
+ }
+
return true;
}
@@ -2332,16 +2473,22 @@ check_locale_encoding(const char *locale, int user_enc)
static void
setlocales(void)
{
- char *canonname;
-
- /* set empty lc_* values to locale config if set */
+ char *canonname = NULL;
if (locale)
{
+ /*
+ * Set up the collation provider if possible and canonicalize the locale
+ * name.
+ */
+ check_locale_collprovider(locale, &canonname, &collprovider, NULL);
+ if (!canonname)
+ exit(1); /* check_locale_collprovider printed the error */
+ locale = canonname;
+
+ /* set empty lc_* values to locale config if set */
if (!lc_ctype)
lc_ctype = locale;
- if (!lc_collate)
- lc_collate = locale;
if (!lc_numeric)
lc_numeric = locale;
if (!lc_time)
@@ -2352,29 +2499,83 @@ setlocales(void)
lc_messages = locale;
}
+ if (lc_collate)
+ {
+ /*
+ * Set up the collation provider if possible and canonicalize the locale
+ * name.
+ */
+ check_locale_collprovider(lc_collate, &canonname, &collprovider, NULL);
+ if (!canonname)
+ exit(1); /* check_locale_collprovider printed the error */
+ lc_collate = canonname;
+ }
+ else if (canonname)
+ {
+ /* we have already canonicalized the locale name */
+ lc_collate = pstrdup(canonname);
+ }
+
/*
* canonicalize locale names, and obtain any missing values from our
* current environment
*/
- check_locale_name(LC_CTYPE, lc_ctype, &canonname);
+ check_locale_name(LC_CTYPE, lc_ctype, &canonname, '\0');
lc_ctype = canonname;
- check_locale_name(LC_COLLATE, lc_collate, &canonname);
+
+ /* we always check lc_collate for libc */
+ check_locale_name(LC_COLLATE, lc_collate, &canonname, COLLPROVIDER_LIBC);
+ if (lc_collate)
+ pfree(lc_collate);
lc_collate = canonname;
- check_locale_name(LC_NUMERIC, lc_numeric, &canonname);
+
+ /* determine the collation provider if we haven't already done it */
+ if (!is_valid_nondefault_collprovider(collprovider))
+ {
+#ifdef USE_ICU
+ if (!locale_is_c(lc_collate))
+ {
+ collprovider = COLLPROVIDER_ICU;
+ }
+ else
+#endif
+ {
+ collprovider = COLLPROVIDER_LIBC;
+ }
+ }
+
+ Assert(is_valid_nondefault_collprovider(collprovider));
+
+ /* check lc_collate and lc_ctype for icu if we need it */
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+ check_locale_name(LC_COLLATE, lc_collate, NULL, collprovider);
+ if (strcmp(lc_collate, lc_ctype) != 0)
+ {
+ fprintf(stderr,
+ _("%s: collations with different collate and ctype values are not supported by ICU\n"),
+ progname);
+ exit(1);
+ }
+ }
+
+ check_locale_name(LC_NUMERIC, lc_numeric, &canonname, '\0');
lc_numeric = canonname;
- check_locale_name(LC_TIME, lc_time, &canonname);
+ check_locale_name(LC_TIME, lc_time, &canonname, '\0');
lc_time = canonname;
- check_locale_name(LC_MONETARY, lc_monetary, &canonname);
+ check_locale_name(LC_MONETARY, lc_monetary, &canonname, '\0');
lc_monetary = canonname;
#if defined(LC_MESSAGES) && !defined(WIN32)
- check_locale_name(LC_MESSAGES, lc_messages, &canonname);
+ check_locale_name(LC_MESSAGES, lc_messages, &canonname, '\0');
lc_messages = canonname;
#else
/* when LC_MESSAGES is not available, use the LC_CTYPE setting */
- check_locale_name(LC_CTYPE, lc_messages, &canonname);
+ check_locale_name(LC_CTYPE, lc_messages, &canonname, '\0');
lc_messages = canonname;
#endif
+
+ set_collation_version();
}
/*
@@ -2592,6 +2793,9 @@ setup_locale_encoding(void)
lc_time);
}
+ printf(_("The default collation provider is \"%s\".\n"),
+ get_collprovider_name(collprovider));
+
if (!encoding)
{
int ctype_enc;
@@ -2642,8 +2846,8 @@ setup_locale_encoding(void)
else
encodingid = get_encoding_id(encoding);
- if (!check_locale_encoding(lc_ctype, encodingid) ||
- !check_locale_encoding(lc_collate, encodingid))
+ if (!check_locale_encoding(lc_ctype, encodingid, '\0') ||
+ !check_locale_encoding(lc_collate, encodingid, collprovider))
exit(1); /* check_locale_encoding printed the error */
}
@@ -3419,3 +3623,113 @@ main(int argc, char *argv[])
return 0;
}
+
+#ifdef USE_ICU
+/*
+ * If locale is "" return the environment value from setlocale().
+ *
+ * Otherwise return a malloc'd copy of locale if it is not NULL.
+ *
+ * This should match the backend's check_icu_locale() function.
+ */
+static char *
+check_icu_locale_name(const char *locale)
+{
+ char *canonname = NULL;
+ char *winlocale = NULL;
+ char *result;
+
+ /* Windows locales can be in the format ".codepage" */
+ if (locale && (strlen(locale) == 0 || locale[0] == '.'))
+ {
+ check_locale_name(LC_COLLATE, locale, &canonname, COLLPROVIDER_LIBC);
+ locale = (const char *) canonname;
+ }
+
+#ifdef WIN32
+ if (!locale_is_c(locale))
+ {
+ winlocale = check_icu_winlocale(locale);
+
+ if (winlocale == NULL && locale != NULL)
+ exit(1); /* check_icu_winlocale printed the error */
+ else
+ locale = winlocale;
+ }
+#endif
+
+ result = locale ? pstrdup(locale) : NULL;
+
+ if (canonname)
+ pfree(canonname);
+ if (winlocale)
+ pfree(winlocale);
+
+ return result;
+}
+#endif /* USE_ICU */
+
+/*
+ * Set up the lc_collate version (get it from the collation provider).
+ */
+static void
+set_collation_version(void)
+{
+ char *wincollate = NULL;
+ char *langtag = NULL;
+ const char *collate;
+ bool failure;
+
+ Assert(lc_collate);
+ Assert(is_valid_nondefault_collprovider(collprovider));
+
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+#ifdef USE_ICU
+ collate = (const char *) lc_collate;
+
+#ifdef WIN32
+ if (!locale_is_c(collate))
+ {
+ wincollate = check_icu_winlocale(collate);
+
+ if (wincollate == NULL && collate != NULL)
+ exit(1); /* check_icu_winlocale printed the error */
+ else
+ collate = (const char *) wincollate;
+ }
+#endif /* WIN32 */
+
+ langtag = get_icu_language_tag(collate);
+ if (!langtag)
+ {
+ /* get_icu_language_tag printed the main error message */
+ fprintf(stderr, _("Rerun %s with a different locale selection.\n"),
+ progname);
+ exit(1);
+ }
+ collate = get_icu_collate(collate, langtag);
+#else /* not USE_ICU */
+ fprintf(stderr,
+ _("%s: ICU is not supported in this build\n"
+ "You need to rebuild PostgreSQL using --with-icu.\n"),
+ progname);
+ exit(1);
+#endif /* not USE_ICU */
+ }
+ else
+ {
+ /* COLLPROVIDER_LIBC */
+ collate = (const char *) lc_collate;
+ }
+
+ get_collation_actual_version(collprovider, collate, &collversion, &failure);
+ if (failure)
+ /* get_collation_actual_version printed the error */
+ exit(1);
+
+ if (langtag)
+ free(langtag);
+ if (wincollate)
+ free(wincollate);
+}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 20e8aed..a42ee22 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -47,12 +47,14 @@
#include "catalog/pg_attribute_d.h"
#include "catalog/pg_cast_d.h"
#include "catalog/pg_class_d.h"
+#include "catalog/pg_collation.h"
#include "catalog/pg_default_acl_d.h"
#include "catalog/pg_largeobject_d.h"
#include "catalog/pg_largeobject_metadata_d.h"
#include "catalog/pg_proc_d.h"
#include "catalog/pg_trigger_d.h"
#include "catalog/pg_type_d.h"
+#include "common/pg_collation_fn_common.h"
#include "libpq/libpq-fs.h"
#include "dumputils.h"
@@ -13326,9 +13328,10 @@ dumpCollation(Archive *fout, CollInfo *collinfo)
int i_collprovider;
int i_collcollate;
int i_collctype;
- const char *collprovider;
+ const char *collproviderstr;
const char *collcollate;
const char *collctype;
+ const char *collprovider_name;
/* Skip if not to be dumped */
if (!collinfo->dobj.dump || dopt->dataOnly)
@@ -13366,28 +13369,28 @@ dumpCollation(Archive *fout, CollInfo *collinfo)
i_collcollate = PQfnumber(res, "collcollate");
i_collctype = PQfnumber(res, "collctype");
- collprovider = PQgetvalue(res, 0, i_collprovider);
+ collproviderstr = PQgetvalue(res, 0, i_collprovider);
collcollate = PQgetvalue(res, 0, i_collcollate);
collctype = PQgetvalue(res, 0, i_collctype);
+ /*
+ * Use COLLPROVIDER_DEFAULT to allow dumping pg_catalog; not accepted on
+ * input
+ */
+ collprovider_name = get_collprovider_name(collproviderstr[0]);
+ if (!collprovider_name)
+ exit_horribly(NULL,
+ "unrecognized collation provider: %s\n",
+ collproviderstr);
+
+
appendPQExpBuffer(delq, "DROP COLLATION %s;\n",
fmtQualifiedDumpable(collinfo));
appendPQExpBuffer(q, "CREATE COLLATION %s (",
fmtQualifiedDumpable(collinfo));
- appendPQExpBufferStr(q, "provider = ");
- if (collprovider[0] == 'c')
- appendPQExpBufferStr(q, "libc");
- else if (collprovider[0] == 'i')
- appendPQExpBufferStr(q, "icu");
- else if (collprovider[0] == 'd')
- /* to allow dumping pg_catalog; not accepted on input */
- appendPQExpBufferStr(q, "default");
- else
- exit_horribly(NULL,
- "unrecognized collation provider: %s\n",
- collprovider);
+ appendPQExpBuffer(q, "provider = %s", collprovider_name);
if (strcmp(collcollate, collctype) == 0)
{
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 80d8338..9484955 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -16,7 +16,9 @@
#include "catalog/pg_attribute_d.h"
#include "catalog/pg_class_d.h"
+#include "catalog/pg_collation.h"
#include "catalog/pg_default_acl_d.h"
+#include "common/pg_collation_fn_common.h"
#include "fe_utils/string_utils.h"
#include "common.h"
@@ -4074,7 +4076,13 @@ listCollations(const char *pattern, bool verbose, bool showSystem)
if (pset.sversion >= 100000)
appendPQExpBuffer(&buf,
- ",\n CASE c.collprovider WHEN 'd' THEN 'default' WHEN 'c' THEN 'libc' WHEN 'i' THEN 'icu' END AS \"%s\"",
+ ",\n CASE c.collprovider WHEN '%c' THEN '%s' WHEN '%c' THEN '%s' WHEN '%c' THEN '%s' END AS \"%s\"",
+ COLLPROVIDER_DEFAULT,
+ get_collprovider_name(COLLPROVIDER_DEFAULT),
+ COLLPROVIDER_LIBC,
+ get_collprovider_name(COLLPROVIDER_LIBC),
+ COLLPROVIDER_ICU,
+ get_collprovider_name(COLLPROVIDER_ICU),
gettext_noop("Provider"));
if (verbose)
diff --git a/src/bin/scripts/Makefile b/src/bin/scripts/Makefile
index 4c6e4b9..1e80f0b 100644
--- a/src/bin/scripts/Makefile
+++ b/src/bin/scripts/Makefile
@@ -19,7 +19,7 @@ include $(top_builddir)/src/Makefile.global
PROGRAMS = createdb createuser dropdb dropuser clusterdb vacuumdb reindexdb pg_isready
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
-LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport) $(ICU_LIBS)
all: $(PROGRAMS)
diff --git a/src/bin/scripts/createdb.c b/src/bin/scripts/createdb.c
index fc10888..816e322 100644
--- a/src/bin/scripts/createdb.c
+++ b/src/bin/scripts/createdb.c
@@ -58,6 +58,7 @@ main(int argc, char *argv[])
char *lc_collate = NULL;
char *lc_ctype = NULL;
char *locale = NULL;
+ char *canonname = NULL;
PQExpBufferData sql;
@@ -153,7 +154,15 @@ main(int argc, char *argv[])
progname);
exit(1);
}
- lc_ctype = locale;
+
+ /*
+ * remove the collation provider modifier from the locale for lc_ctype
+ */
+ check_locale_collprovider(locale, &canonname, NULL, NULL);
+ if (!canonname)
+ exit(1); /* check_locale_collprovider printed the error */
+ lc_ctype = canonname;
+
lc_collate = locale;
}
@@ -241,6 +250,9 @@ main(int argc, char *argv[])
PQfinish(conn);
+ if (canonname)
+ pfree(canonname);
+
exit(0);
}
diff --git a/src/common/Makefile b/src/common/Makefile
index 1fc2c66..4ce4341 100644
--- a/src/common/Makefile
+++ b/src/common/Makefile
@@ -43,7 +43,7 @@ LIBS += $(PTHREAD_LIBS)
OBJS_COMMON = base64.o config_info.o controldata_utils.o exec.o file_perm.o \
ip.o keywords.o md5.o pg_lzcompress.o pgfnames.o psprintf.o relpath.o \
rmtree.o saslprep.o scram-common.o string.o unicode_norm.o \
- username.o wait_error.o
+ username.o wait_error.o pg_collation_fn_common.o
ifeq ($(with_openssl),yes)
OBJS_COMMON += sha2_openssl.o
diff --git a/src/common/pg_collation_fn_common.c b/src/common/pg_collation_fn_common.c
new file mode 100644
index 0000000..a3ba3a3
--- /dev/null
+++ b/src/common/pg_collation_fn_common.c
@@ -0,0 +1,90 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_collation_fn_common.c
+ * common routines to support manipulation of the pg_collation relation
+ *
+ * Portions Copyright (c) 1996-2017, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/common/pg_collation_fn_common.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifdef FRONTEND
+#include "postgres_fe.h"
+#else
+#include "postgres.h"
+#endif
+
+#include "catalog/pg_collation.h"
+#include "common/pg_collation_fn_common.h"
+
+
+/*
+ * Note that we search the table with pg_strcasecmp(), so variant
+ * capitalizations don't need their own entries.
+ */
+typedef struct collprovider_name
+{
+ char collprovider;
+ const char *name;
+} collprovider_name;
+
+static const collprovider_name collprovider_name_tbl[] =
+{
+ {COLLPROVIDER_DEFAULT, "default"},
+ {COLLPROVIDER_LIBC, "libc"},
+ {COLLPROVIDER_ICU, "icu"},
+ {'\0', NULL} /* end marker */
+};
+
+/*
+ * Get the collation provider from the given collation provider name.
+ *
+ * Return '\0' if we can't determine it.
+ */
+char
+get_collprovider(const char *name)
+{
+ int i;
+
+ if (!name)
+ return '\0';
+
+ /* Check the table */
+ for (i = 0; collprovider_name_tbl[i].name; ++i)
+ if (pg_strcasecmp(name, collprovider_name_tbl[i].name) == 0)
+ return collprovider_name_tbl[i].collprovider;
+
+ return '\0';
+}
+
+/*
+ * Get the name of the given collation provider.
+ *
+ * Return NULL if we can't determine it.
+ */
+const char *
+get_collprovider_name(char collprovider)
+{
+ int i;
+
+ /* Check the table */
+ for (i = 0; collprovider_name_tbl[i].collprovider; ++i)
+ if (collprovider_name_tbl[i].collprovider == collprovider)
+ return collprovider_name_tbl[i].name;
+
+ return NULL;
+}
+
+/*
+ * Return true if collation provider is nondefault and valid, and false otherwise.
+ */
+bool
+is_valid_nondefault_collprovider(char collprovider)
+{
+ return (collprovider == COLLPROVIDER_LIBC ||
+ collprovider == COLLPROVIDER_ICU);
+}
diff --git a/src/fe_utils/.gitignore b/src/fe_utils/.gitignore
index 37f5f75..b14041b 100644
--- a/src/fe_utils/.gitignore
+++ b/src/fe_utils/.gitignore
@@ -1 +1,2 @@
/psqlscan.c
+/pg_collation_fn_common.c
diff --git a/src/fe_utils/Makefile b/src/fe_utils/Makefile
index 5362cff..f6ffa09 100644
--- a/src/fe_utils/Makefile
+++ b/src/fe_utils/Makefile
@@ -19,7 +19,8 @@ include $(top_builddir)/src/Makefile.global
override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) $(CPPFLAGS)
-OBJS = mbprint.o print.o psqlscan.o simple_list.o string_utils.o conditional.o
+OBJS = mbprint.o print.o psqlscan.o simple_list.o string_utils.o conditional.o \
+ pg_collation_fn_common.o
all: libpgfeutils.a
@@ -33,6 +34,13 @@ psqlscan.c: FLEX_FIX_WARNING=yes
distprep: psqlscan.c
+# Pull in pg_collation_fn_common.c from src/common. That exposes us to
+# risks of version skew if we link to a shared library. Do it the
+# hard way, instead, so that we're statically linked.
+
+pg_collation_fn_common.c: % : $(top_srcdir)/src/common/%
+ rm -f $@ && $(LN_S) $< .
+
# libpgfeutils could be useful to contrib, so install it
install: all installdirs
$(INSTALL_STLIB) libpgfeutils.a '$(DESTDIR)$(libdir)/libpgfeutils.a'
@@ -45,6 +53,7 @@ uninstall:
clean distclean:
rm -f libpgfeutils.a $(OBJS) lex.backup
+ rm -f pg_collation_fn_common.c
# psqlscan.c is supposed to be in the distribution tarball,
# so do not clean it in the clean/distclean rules
diff --git a/src/include/commands/dbcommands.h b/src/include/commands/dbcommands.h
index 677c7fc..d1b2776 100644
--- a/src/include/commands/dbcommands.h
+++ b/src/include/commands/dbcommands.h
@@ -29,6 +29,7 @@ extern ObjectAddress AlterDatabaseOwner(const char *dbname, Oid newOwnerId);
extern Oid get_database_oid(const char *dbname, bool missingok);
extern char *get_database_name(Oid dbid);
-extern void check_encoding_locale_matches(int encoding, const char *collate, const char *ctype);
+extern void check_encoding_locale_matches(int encoding, const char *collate, const char *ctype,
+ char collprovider);
#endif /* DBCOMMANDS_H */
diff --git a/src/include/common/pg_collation_fn_common.h b/src/include/common/pg_collation_fn_common.h
new file mode 100644
index 0000000..f05778d
--- /dev/null
+++ b/src/include/common/pg_collation_fn_common.h
@@ -0,0 +1,22 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_collation_fn_common.h
+ * prototypes for functions in common/pg_collation_fn_common.c
+ *
+ *
+ * Portions Copyright (c) 1996-2017, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/common/pg_collation_fn_common.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#ifndef PG_COLLATION_FN_COMMON_H
+#define PG_COLLATION_FN_COMMON_H
+
+extern char get_collprovider(const char *name);
+extern const char *get_collprovider_name(char collprovider);
+extern bool is_valid_nondefault_collprovider(char collprovider);
+
+#endif /* PG_COLLATION_FN_COMMON_H */
diff --git a/src/include/pg_config.h.win32 b/src/include/pg_config.h.win32
index ab276f7..1f27b58 100644
--- a/src/include/pg_config.h.win32
+++ b/src/include/pg_config.h.win32
@@ -693,6 +693,10 @@
/* Define to use /dev/urandom for random number generation */
/* #undef USE_DEV_URANDOM */
+/* Define to build with ICU support. (--with-icu) */
+/* #undef USE_ICU */
+
+
/* Define to 1 to build with LDAP support. (--with-ldap) */
/* #undef USE_LDAP */
diff --git a/src/include/port.h b/src/include/port.h
index 74a9dc4..02cb8fc 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -432,6 +432,40 @@ extern int pg_get_encoding_from_locale(const char *ctype, bool write_message);
extern int pg_codepage_to_encoding(UINT cp);
#endif
+/* do not make libpq with icu */
+#ifndef LIBPQ_MAKE
+
+extern void check_locale_collprovider(const char *locale, char **canonname,
+ char *collprovider, char **collversion);
+extern bool locale_is_c(const char *locale);
+extern char *get_full_collation_name(const char *locale, char collprovider,
+ const char *collversion);
+
+#ifdef FRONTEND
+extern void get_collation_actual_version(char collprovider,
+ const char *collcollate,
+ char **collversion, bool *failure);
+#else
+extern char *get_collation_actual_version(char collprovider,
+ const char *collcollate);
+#endif
+
+#ifdef USE_ICU
+#define ICU_ROOT_LOCALE "root"
+
+/* Users of this must import unicode/ucol.h too. */
+struct UCollator;
+extern struct UCollator *open_collator(const char *collate);
+
+extern char *get_icu_language_tag(const char *localename);
+extern const char *get_icu_collate(const char *locale, const char *langtag);
+#ifdef WIN32
+extern char *check_icu_winlocale(const char *winlocale);
+#endif /* WIN32 */
+#endif /* USE_ICU */
+
+#endif /* not LIBPQ_MAKE */
+
/* port/inet_net_ntop.c */
extern char *inet_net_ntop(int af, const void *src, int bits,
char *dst, size_t size);
diff --git a/src/include/port/win32.h b/src/include/port/win32.h
index 9f48a58..7e3e7e5 100644
--- a/src/include/port/win32.h
+++ b/src/include/port/win32.h
@@ -16,7 +16,7 @@
* get support for GetLocaleInfoEx() with locales. For everything else
* the minimum version is Windows XP (0x0501).
*/
-#if defined(_MSC_VER) && _MSC_VER >= 1900
+#if defined(_MSC_VER) && _MSC_VER >= 1800
#define MIN_WINNT 0x0600
#else
#define MIN_WINNT 0x0501
diff --git a/src/include/utils/pg_locale.h b/src/include/utils/pg_locale.h
index 88a3134..161a14e 100644
--- a/src/include/utils/pg_locale.h
+++ b/src/include/utils/pg_locale.h
@@ -57,8 +57,10 @@ extern void assign_locale_numeric(const char *newval, void *extra);
extern bool check_locale_time(char **newval, void **extra, GucSource source);
extern void assign_locale_time(const char *newval, void *extra);
-extern bool check_locale(int category, const char *locale, char **canonname);
-extern char *pg_perm_setlocale(int category, const char *locale);
+extern bool check_locale(int category, const char *locale, char **canonname,
+ char collprovider);
+extern const char *pg_perm_setlocale(int category, const char *locale,
+ char collprovider);
extern void check_strxfrm_bug(void);
extern bool lc_collate_is_c(Oid collation);
@@ -102,11 +104,11 @@ typedef struct pg_locale_struct *pg_locale_t;
extern pg_locale_t pg_newlocale_from_collation(Oid collid);
-extern char *get_collation_actual_version(char collprovider, const char *collcollate);
-
#ifdef USE_ICU
extern int32_t icu_to_uchar(UChar **buff_uchar, const char *buff, size_t nbytes);
extern int32_t icu_from_uchar(char **result, const UChar *buff_uchar, int32_t len_uchar);
+extern const char *get_icu_default_collate(void);
+extern UCollator *get_default_collation_collator(void);
#endif
/* These functions convert from/to libc's wchar_t, *not* pg_wchar_t */
@@ -115,4 +117,6 @@ extern size_t wchar2char(char *to, const wchar_t *from, size_t tolen,
extern size_t char2wchar(wchar_t *to, size_t tolen,
const char *from, size_t fromlen, pg_locale_t locale);
+extern char get_default_collprovider(void);
+
#endif /* _PG_LOCALE_ */
diff --git a/src/interfaces/libpq/.gitignore b/src/interfaces/libpq/.gitignore
index 5c232ae..212edd9 100644
--- a/src/interfaces/libpq/.gitignore
+++ b/src/interfaces/libpq/.gitignore
@@ -32,3 +32,4 @@
/unicode_norm.c
/encnames.c
/wchar.c
+/pg_collation_fn_common.c
diff --git a/src/interfaces/libpq/Makefile b/src/interfaces/libpq/Makefile
index abe0a50..32a5d43 100644
--- a/src/interfaces/libpq/Makefile
+++ b/src/interfaces/libpq/Makefile
@@ -19,7 +19,7 @@ NAME= pq
SO_MAJOR_VERSION= 5
SO_MINOR_VERSION= $(MAJORVERSION)
-override CPPFLAGS := -DFRONTEND -DUNSAFE_STAT_OK -I$(srcdir) $(CPPFLAGS) -I$(top_builddir)/src/port -I$(top_srcdir)/src/port
+override CPPFLAGS := -DFRONTEND -DUNSAFE_STAT_OK -I$(srcdir) $(CPPFLAGS) -I$(top_builddir)/src/port -I$(top_srcdir)/src/port -DLIBPQ_MAKE
ifneq ($(PORTNAME), win32)
override CFLAGS += $(PTHREAD_CFLAGS)
endif
diff --git a/src/port/chklocale.c b/src/port/chklocale.c
index dde9130..a30bded 100644
--- a/src/port/chklocale.c
+++ b/src/port/chklocale.c
@@ -23,8 +23,26 @@
#include <langinfo.h>
#endif
+#ifdef USE_ICU
+#include <unicode/ucol.h>
+#endif
+
+#include "catalog/pg_collation.h"
+#include "common/pg_collation_fn_common.h"
#include "mb/pg_wchar.h"
+/*
+ * In backend, we will use palloc/pfree. In frontend, use malloc/free.
+ */
+#ifndef FRONTEND
+#define STRDUP(s) pstrdup(s)
+#define ALLOC(size) palloc(size)
+#define FREE(s) pfree(s)
+#else
+#define STRDUP(s) strdup(s)
+#define ALLOC(size) malloc(size)
+#define FREE(s) free(s)
+#endif
/*
* This table needs to recognize all the CODESET spellings for supported
@@ -436,3 +454,583 @@ pg_get_encoding_from_locale(const char *ctype, bool write_message)
}
#endif /* (HAVE_LANGINFO_H && CODESET) || WIN32 */
+
+/* libpq is built without ICU, so leave these functions out of it */
+#ifndef LIBPQ_MAKE
+
+/*
+ * Check whether the locale name ends with a collation provider modifier.
+ *
+ * Set *collprovider to the provider named by the modifier, or to '\0' if there
+ * is none. Set *collversion to the version that follows the provider modifier,
+ * or to NULL if there is none.
+ *
+ * If locale is not NULL, *canonname receives a copy of the locale name without
+ * the provider modifier and version (palloc'd in the backend, malloc'd in the
+ * frontend). The canonname can have zero length.
+ */
+void
+check_locale_collprovider(const char *locale, char **canonname,
+ char *collprovider, char **collversion)
+{
+ const char *modifier_sign,
+ *dot_sign,
+ *cur_collprovider_end;
+ char cur_collprovider_name[NAMEDATALEN];
+ int cur_collprovider_len;
+ char cur_collprovider;
+
+ /* in case of failure or if we don't find them in the locale name */
+ if (canonname)
+ *canonname = NULL;
+ if (collprovider)
+ *collprovider = '\0';
+ if (collversion)
+ *collversion = NULL;
+
+ if (!locale)
+ return;
+
+ /* find the last occurrence of the modifier sign '@' in the locale */
+ modifier_sign = strrchr(locale, '@');
+
+ if (!modifier_sign)
+ {
+ /* just copy all the name */
+ if (canonname)
+ *canonname = STRDUP(locale);
+ return;
+ }
+
+ /* check if there's a version after the collation provider modifier */
+ if ((dot_sign = strchr(modifier_sign, '.')) == NULL)
+ cur_collprovider_end = &locale[strlen(locale)];
+ else
+ cur_collprovider_end = dot_sign;
+
+ cur_collprovider_len = cur_collprovider_end - modifier_sign - 1;
+ if (cur_collprovider_len + 1 > NAMEDATALEN)
+ {
+#ifdef FRONTEND
+ fprintf(stderr, _("collation provider name is too long: %s"), locale);
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else /* not FRONTEND */
+ ereport(ERROR,
+ (errmsg("collation provider name is too long: %s", locale)));
+#endif /* not FRONTEND */
+ return;
+ }
+
+ strncpy(cur_collprovider_name, modifier_sign + 1, cur_collprovider_len);
+ cur_collprovider_name[cur_collprovider_len] = '\0';
+
+ /* check if this is a valid collprovider name */
+ cur_collprovider = get_collprovider(cur_collprovider_name);
+ if (is_valid_nondefault_collprovider(cur_collprovider))
+ {
+ if (collprovider)
+ *collprovider = cur_collprovider;
+
+ if (canonname)
+ {
+ int canonname_len = modifier_sign - locale;
+
+ *canonname = ALLOC((canonname_len + 1) * sizeof(char));
+ if (*canonname)
+ {
+ strncpy(*canonname, locale, canonname_len);
+ (*canonname)[canonname_len] = '\0';
+ }
+ else
+ {
+#ifdef FRONTEND
+ fprintf(stderr, _("out of memory"));
+ /*
+ * keep newline separate so there's only one translatable string
+ */
+ fputc('\n', stderr);
+#else /* not FRONTEND */
+ ereport(ERROR, (errmsg("out of memory")));
+#endif /* not FRONTEND */
+ }
+ }
+
+ if (dot_sign && collversion)
+ *collversion = STRDUP(dot_sign + 1);
+ }
+ else
+ {
+ /* just copy all the name */
+ if (canonname)
+ *canonname = STRDUP(locale);
+ }
+}
+
+/*
+ * Return true if locale is "C" or "POSIX".
+ */
+bool
+locale_is_c(const char *locale)
+{
+ return locale && (strcmp(locale, "C") == 0 || strcmp(locale, "POSIX") == 0);
+}
+
+/*
+ * Return the locale name with the provider modifier and optional version appended.
+ *
+ * Return NULL if locale is NULL.
+ */
+char *
+get_full_collation_name(const char *locale, char collprovider,
+ const char *collversion)
+{
+ char *new_locale;
+ int old_len,
+ len_with_provider,
+ new_len;
+ const char *collprovider_name;
+
+ if (!locale)
+ return NULL;
+
+ collprovider_name = get_collprovider_name(collprovider);
+ Assert(collprovider_name);
+
+ old_len = strlen(locale);
+ new_len = len_with_provider = old_len + 1 + strlen(collprovider_name);
+ if (collversion && *collversion)
+ new_len += 1 + strlen(collversion);
+
+ new_locale = ALLOC((new_len + 1) * sizeof(char));
+ if (!new_locale)
+ {
+#ifdef FRONTEND
+ fprintf(stderr, _("out of memory"));
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else /* not FRONTEND */
+ ereport(ERROR, (errmsg("out of memory")));
+#endif /* not FRONTEND */
+
+ return NULL;
+ }
+
+ /* add the collation provider modifier */
+ strcpy(new_locale, locale);
+ new_locale[old_len] = '@';
+ strcpy(&new_locale[old_len + 1], collprovider_name);
+
+ /* add the collation version if needed */
+ if (collversion && *collversion)
+ {
+ new_locale[len_with_provider] = '.';
+ strcpy(&new_locale[len_with_provider + 1], collversion);
+ }
+
+ new_locale[new_len] = '\0';
+
+ return new_locale;
+}
+
+/*
+ * Get provider-specific collation version string for the given collation from
+ * the operating system/library.
+ *
+ * A particular provider must always either return a non-NULL string or return
+ * NULL (if it doesn't support versions). It must not return NULL for some
+ * collcollate and not NULL for others.
+ */
+#ifdef FRONTEND
+void
+get_collation_actual_version(char collprovider, const char *collcollate,
+ char **collversion, bool *failure)
+{
+ if (failure)
+ *failure = false;
+
+#ifdef USE_ICU
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+ UCollator *collator = open_collator(collcollate);
+ UVersionInfo versioninfo;
+ char buf[U_MAX_VERSION_STRING_LENGTH];
+
+ if (collator)
+ {
+ ucol_getVersion(collator, versioninfo);
+ ucol_close(collator);
+
+ u_versionToString(versioninfo, buf);
+ if (collversion)
+ *collversion = STRDUP(buf);
+ }
+ else
+ {
+ if (collversion)
+ *collversion = NULL;
+ if (failure)
+ *failure = true;
+ }
+ }
+ else
+#endif
+ {
+ if (collversion)
+ *collversion = NULL;
+ }
+}
+#else /* not FRONTEND */
+char *
+get_collation_actual_version(char collprovider, const char *collcollate)
+{
+ char *collversion;
+
+#ifdef USE_ICU
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+ UCollator *collator = open_collator(collcollate);
+ UVersionInfo versioninfo;
+ char buf[U_MAX_VERSION_STRING_LENGTH];
+
+ ucol_getVersion(collator, versioninfo);
+ ucol_close(collator);
+
+ u_versionToString(versioninfo, buf);
+ collversion = STRDUP(buf);
+ }
+ else
+#endif
+ collversion = NULL;
+
+ return collversion;
+}
+#endif /* not FRONTEND */
+
+#ifdef USE_ICU
+/*
+ * Open the collator for this ICU locale. Return NULL in case of failure.
+ */
+UCollator *
+open_collator(const char *collate)
+{
+ UCollator *collator;
+ UErrorCode status;
+ const char *save = uloc_getDefault();
+ char *save_dup;
+
+ if (!save)
+ {
+#ifdef FRONTEND
+ fprintf(stderr, _("ICU error: uloc_getDefault() failed"));
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR, (errmsg("ICU error: uloc_getDefault() failed")));
+#endif
+ return NULL;
+ }
+
+ /* save may be pointing at a modifiable scratch variable, so copy it. */
+ save_dup = STRDUP(save);
+
+ /* set the default locale to root */
+ status = U_ZERO_ERROR;
+ uloc_setDefault(ICU_ROOT_LOCALE, &status);
+ if (U_FAILURE(status))
+ {
+#ifdef FRONTEND
+ fprintf(stderr, _("ICU error: failed to set the default locale to \"%s\": %s"),
+ ICU_ROOT_LOCALE, u_errorName(status));
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR,
+ (errmsg("ICU error: failed to set the default locale to \"%s\": %s",
+ ICU_ROOT_LOCALE, u_errorName(status))));
+#endif
+ return NULL;
+ }
+
+ /* get a collator for this collate */
+ status = U_ZERO_ERROR;
+ collator = ucol_open(collate, &status);
+ if (U_FAILURE(status))
+ {
+#ifdef FRONTEND
+ fprintf(stderr, _("ICU error: could not open collator for locale \"%s\": %s"),
+ collate, u_errorName(status));
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR,
+ (errmsg("ICU error: could not open collator for locale \"%s\": %s",
+ collate, u_errorName(status))));
+#endif
+ collator = NULL;
+ }
+
+ /* restore old value of the default locale. */
+ status = U_ZERO_ERROR;
+ uloc_setDefault(save_dup, &status);
+ if (U_FAILURE(status))
+ {
+#ifdef FRONTEND
+ fprintf(stderr, _("ICU error: failed to restore old locale \"%s\": %s"),
+ save_dup, u_errorName(status));
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR,
+ (errmsg("ICU error: failed to restore old locale \"%s\": %s",
+ save_dup, u_errorName(status))));
+#endif
+ return NULL;
+ }
+ FREE(save_dup);
+
+ return collator;
+}
+
+/*
+ * Get the ICU language tag for a locale name.
+ * The result is palloc'd in the backend and malloc'd in the frontend.
+ * Return NULL in case of failure or if localename is NULL.
+ */
+char *
+get_icu_language_tag(const char *localename)
+{
+ char buf[ULOC_FULLNAME_CAPACITY];
+ UErrorCode status = U_ZERO_ERROR;
+
+ if (!localename)
+ return NULL;
+
+ uloc_toLanguageTag(localename, buf, sizeof(buf), TRUE, &status);
+ if (U_FAILURE(status))
+ {
+#ifdef FRONTEND
+ fprintf(stderr,
+ _("ICU error: could not convert locale name \"%s\" to language tag: %s"),
+ localename, u_errorName(status));
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR,
+ (errmsg("ICU error: could not convert locale name \"%s\" to language tag: %s",
+ localename, u_errorName(status))));
+#endif
+ return NULL;
+ }
+ return STRDUP(buf);
+}
+
+/*
+ * Get the ICU collation name: the language tag for ICU >= 54, else the locale.
+ */
+const char *
+get_icu_collate(const char *locale, const char *langtag)
+{
+ return U_ICU_VERSION_MAJOR_NUM >= 54 ? langtag : locale;
+}
+
+#ifdef WIN32
+/*
+ * Get the Language Code Identifier (LCID) for the Windows locale.
+ *
+ * Return zero in case of failure.
+ */
+static uint32
+get_lcid(const wchar_t *winlocale)
+{
+ /*
+ * The second argument to the LocaleNameToLCID function is:
+ * - Prior to Windows 7: reserved; should always be 0.
+ * - Beginning in Windows 7: use LOCALE_ALLOW_NEUTRAL_NAMES to allow the
+ * return of lcids of locales without regions.
+ */
+#if (NTDDI_VERSION >= NTDDI_WIN7)
+ return LocaleNameToLCID(winlocale, LOCALE_ALLOW_NEUTRAL_NAMES);
+#else
+ return LocaleNameToLCID(winlocale, 0);
+#endif
+}
+
+/*
+ * char2wchar_ascii --- convert multibyte characters to wide characters
+ *
+ * This is a simplified version of the backend's char2wchar() function.
+ */
+static size_t
+char2wchar_ascii(wchar_t *to, size_t tolen, const char *from, size_t fromlen)
+{
+ size_t result;
+
+ if (tolen == 0)
+ return 0;
+
+ /* Win32 API does not work for zero-length input */
+ if (fromlen == 0)
+ result = 0;
+ else
+ {
+ result = MultiByteToWideChar(CP_ACP, 0, from, fromlen, to, tolen - 1);
+ /* A zero return is failure */
+ if (result == 0)
+ result = -1;
+ }
+
+ if (result != -1)
+ {
+ Assert(result < tolen);
+ /* Append trailing null wchar (MultiByteToWideChar() does not) */
+ to[result] = 0;
+ }
+
+ return result;
+}
+
+/*
+ * Get the canonical ICU name for the Windows locale.
+ *
+ * Return a newly allocated string, or NULL in case of failure.
+ */
+char *
+check_icu_winlocale(const char *winlocale)
+{
+ uint32 lcid;
+ char canonname_buf[ULOC_FULLNAME_CAPACITY];
+ UErrorCode status = U_ZERO_ERROR;
+#if (_MSC_VER >= 1400) /* VC8.0 or later */
+ _locale_t loct = NULL;
+#endif
+
+ if (winlocale == NULL)
+ return NULL;
+
+ /* Get the Language Code Identifier (LCID). */
+
+#if (_MSC_VER >= 1400) /* VC8.0 or later */
+ loct = _create_locale(LC_COLLATE, winlocale);
+
+ if (loct != NULL)
+ {
+#if (_MSC_VER >= 1700) /* Visual Studio 2012 or later */
+ if ((lcid = get_lcid(loct->locinfo->locale_name[LC_COLLATE])) == 0)
+ {
+ /* there's an error */
+#ifdef FRONTEND
+ fprintf(stderr,
+ _("failed to get the Language Code Identifier (LCID) for locale \"%s\""),
+ winlocale);
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else /* not FRONTEND */
+ ereport(ERROR,
+ (errmsg("failed to get the Language Code Identifier (LCID) for locale \"%s\"",
+ winlocale)));
+#endif /* not FRONTEND */
+ _free_locale(loct);
+ return NULL;
+ }
+#else /* _MSC_VER >= 1400 && _MSC_VER < 1700 */
+ if ((lcid = loct->locinfo->lc_handle[LC_COLLATE]) == 0)
+ {
+ /* there's an error */
+#ifdef FRONTEND
+ fprintf(stderr,
+ _("failed to get the Language Code Identifier (LCID) for locale \"%s\""),
+ winlocale);
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else /* not FRONTEND */
+ ereport(ERROR,
+ (errmsg("failed to get the Language Code Identifier (LCID) for locale \"%s\"",
+ winlocale)));
+#endif /* not FRONTEND */
+ _free_locale(loct);
+ return NULL;
+ }
+#endif /* _MSC_VER >= 1400 && _MSC_VER < 1700 */
+ _free_locale(loct);
+ }
+ else
+#endif /* VC8.0 or later */
+ {
+ if (strlen(winlocale) == 0)
+ {
+ lcid = LOCALE_USER_DEFAULT;
+ }
+ else
+ {
+ size_t locale_len = strlen(winlocale);
+ wchar_t *wlocale = (wchar_t*) ALLOC(
+ (locale_len + 1) * sizeof(wchar_t));
+ /* Locale names use only ASCII */
+ size_t locale_wlen = char2wchar_ascii(wlocale, locale_len + 1,
+ winlocale, locale_len);
+ if (locale_wlen == -1)
+ {
+ /* there's an error */
+#ifdef FRONTEND
+ fprintf(stderr,
+ _("failed to convert locale \"%s\" to wide characters"),
+ winlocale);
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR,
+ (errmsg("failed to convert locale \"%s\" to wide characters",
+ winlocale)));
+#endif
+ FREE(wlocale);
+ return NULL;
+ }
+
+ if ((lcid = get_lcid(wlocale)) == 0)
+ {
+ /* there's an error */
+#ifdef FRONTEND
+ fprintf(stderr,
+ _("failed to get the Language Code Identifier (LCID) for locale \"%s\""),
+ winlocale);
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR,
+ (errmsg("failed to get the Language Code Identifier (LCID) for locale \"%s\"",
+ winlocale)));
+#endif
+ FREE(wlocale);
+ return NULL;
+ }
+
+ FREE(wlocale);
+ }
+ }
+
+ /* Get the canonical ICU name. */
+
+ uloc_getLocaleForLCID(lcid, canonname_buf, sizeof(canonname_buf), &status);
+ if (U_FAILURE(status))
+ {
+#ifdef FRONTEND
+ fprintf(stderr,
+ _("ICU error: failed to get the locale name for LCID 0x%04x: %s"),
+ lcid, u_errorName(status));
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR,
+ (errmsg("ICU error: failed to get the locale name for LCID 0x%04x: %s",
+ lcid, u_errorName(status))));
+#endif
+ return NULL;
+ }
+
+ return STRDUP(canonname_buf);
+}
+#endif /* WIN32 */
+#endif /* USE_ICU */
+
+#endif /* not LIBPQ_MAKE */
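
For readers following the parsing in check_locale_collprovider() above, here is
a minimal standalone sketch of the same "name@provider.version" split. It is
not the patch's code: SplitLocale, split_locale(), copy_span() and
is_known_provider() are hypothetical names, and the hard-coded provider list
("icu" and "libc") is an assumption standing in for get_collprovider() and
is_valid_nondefault_collprovider(), whose definitions are not part of this
section of the patch.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Result of splitting "name@provider.version"; unused parts stay NULL. */
typedef struct SplitLocale
{
	char	   *name;		/* locale name without provider and version */
	char	   *provider;	/* provider name, or NULL if none was found */
	char	   *version;	/* version after the provider, or NULL */
} SplitLocale;

/* Assumption: only these two provider names are recognized. */
static bool
is_known_provider(const char *s, size_t len)
{
	return (len == 3 && strncmp(s, "icu", 3) == 0) ||
		(len == 4 && strncmp(s, "libc", 4) == 0);
}

/* Copy len bytes of s into a fresh NUL-terminated string (NULL on OOM). */
static char *
copy_span(const char *s, size_t len)
{
	char	   *result = malloc(len + 1);

	if (result)
	{
		memcpy(result, s, len);
		result[len] = '\0';
	}
	return result;
}

static SplitLocale
split_locale(const char *locale)
{
	SplitLocale out = {NULL, NULL, NULL};
	const char *at;

	if (locale == NULL)
		return out;

	/* Only the last '@' modifier is a candidate provider. */
	at = strrchr(locale, '@');
	if (at != NULL)
	{
		/* The provider name ends at the first '.' after '@', if any. */
		const char *dot = strchr(at, '.');
		const char *end = dot ? dot : locale + strlen(locale);

		if (is_known_provider(at + 1, (size_t) (end - at - 1)))
		{
			out.name = copy_span(locale, (size_t) (at - locale));
			out.provider = copy_span(at + 1, (size_t) (end - at - 1));
			if (dot)
				out.version = copy_span(dot + 1, strlen(dot + 1));
			return out;
		}
	}

	/* No recognized provider modifier: keep the whole name. */
	out.name = copy_span(locale, strlen(locale));
	return out;
}

static void
free_split_locale(SplitLocale *s)
{
	free(s->name);
	free(s->provider);
	free(s->version);
	s->name = s->provider = s->version = NULL;
}
```

Under these assumptions, "be_BY@latin@icu" splits into the name "be_BY@latin"
and the provider "icu", while "be_BY@icu@latin" is left whole because only the
last '@' modifier is examined. That seems consistent with the "invalid modifier
order" cases in the tests further down, where the unsplit string is then
rejected as a locale name.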
diff --git a/src/test/Makefile b/src/test/Makefile
index efb206a..d74c615 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,7 @@ subdir = src/test
top_builddir = ../..
include $(top_builddir)/src/Makefile.global
-SUBDIRS = perl regress isolation modules authentication recovery subscription
+SUBDIRS = perl regress isolation modules authentication recovery subscription default_collation
# Test suites that are not safe by default but can be run if selected
# by the user via the whitespace-separated list in variable
diff --git a/src/test/default_collation/Makefile b/src/test/default_collation/Makefile
new file mode 100644
index 0000000..2efe8be
--- /dev/null
+++ b/src/test/default_collation/Makefile
@@ -0,0 +1,28 @@
+# src/test/default_collation/Makefile
+
+subdir = src/test/default_collation
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+ifeq ($(with_icu),yes)
+check:
+ $(MAKE) -C icu check
+check-utf8:
+ $(MAKE) -C icu.utf8 check
+ $(MAKE) -C libc.utf8 check
+else
+check:
+ $(MAKE) -C libc check
+check-utf8:
+ $(MAKE) -C libc.utf8 check
+endif
+
+# We don't check libc/ when building with ICU, or icu/ when building without it,
+# but we do want "make clean" to recurse into both. The same goes for libc.utf8/
+# and icu.utf8/, which are not checked by default.
+ALWAYS_SUBDIRS = libc libc.utf8 icu icu.utf8
+
+clean distclean maintainer-clean:
+ for d in $(ALWAYS_SUBDIRS); do \
+ $(MAKE) -C $$d clean || exit; \
+ done
diff --git a/src/test/default_collation/icu.utf8/.gitignore b/src/test/default_collation/icu.utf8/.gitignore
new file mode 100644
index 0000000..871e943
--- /dev/null
+++ b/src/test/default_collation/icu.utf8/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/default_collation/icu.utf8/Makefile b/src/test/default_collation/icu.utf8/Makefile
new file mode 100644
index 0000000..7adecfd
--- /dev/null
+++ b/src/test/default_collation/icu.utf8/Makefile
@@ -0,0 +1,11 @@
+# src/test/default_collation/icu.utf8/Makefile
+
+subdir = src/test/default_collation/icu.utf8
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+ $(prove_check)
+
+clean distclean maintainer-clean:
+ rm -rf tmp_check
diff --git a/src/test/default_collation/icu.utf8/t/001_default_collation.pl b/src/test/default_collation/icu.utf8/t/001_default_collation.pl
new file mode 100644
index 0000000..617c06d
--- /dev/null
+++ b/src/test/default_collation/icu.utf8/t/001_default_collation.pl
@@ -0,0 +1,799 @@
+use strict;
+use warnings;
+
+use Config;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 188;
+
+my $tempdir = TestLib::tempdir;
+my $datadir = "$tempdir/data";
+
+# test initdb
+
+sub test_initdb
+{
+ my ($test_name, $options, $expected_collprovider, $error_message) = @_;
+ my ($in_initdb, $out_initdb, $err_initdb);
+
+ mkdir $datadir;
+
+ my @command = (qw(initdb -A trust -N -D), $datadir, split(" ", $options));
+ print "# Running: " . join(" ", @command) . "\n";
+ my $result = IPC::Run::run \@command, \$in_initdb, \$out_initdb,
+ \$err_initdb;
+
+ if ($error_message)
+ {
+ like($err_initdb,
+ qr{$error_message},
+ "initdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_initdb, "", "\"@command\" no stderr");
+ like($out_initdb,
+ qr{The default collation provider is \"$expected_collprovider\"\.},
+ "initdb: $test_name: check output");
+ }
+
+ File::Path::rmtree $datadir;
+}
+
+sub psql
+{
+ my ($command, $db) = @_;
+ my ($result, $in, $out, $err);
+ my @psql = ('psql', '-X', '-c', $command);
+ if (defined($db))
+ {
+ push(@psql, $db);
+ }
+ print "# Running: " . join(" ", @psql) . "\n";
+ $result = IPC::Run::run \@psql, \$in, \$out, \$err;
+ ($result, $out, $err);
+}
+
+# --locale
+
+test_initdb(
+ "en_US.utf8 locale",
+ "--locale=en_US.utf8",
+ "icu",
+ "");
+
+test_initdb(
+ "en_US.utf8 locale with C ctype",
+ "--locale=en_US.utf8 --lc-ctype=C",
+ "",
+ "collations with different collate and ctype values are not supported by "
+ . "ICU");
+
+test_initdb(
+ "be_BY\@latin icu locale",
+ "--locale=be_BY\@latin\@icu",
+ "icu",
+ "");
+
+test_initdb(
+ "be_BY\@latin icu locale invalid modifier order",
+ "--locale=be_BY\@icu\@latin",
+ "",
+ "invalid locale name \"be_BY\@icu\@latin\"");
+
+# --lc-collate with the same --lc-ctype if needed
+
+test_initdb(
+ "en_US.utf8 lc_collate",
+ "--lc-collate=en_US.utf8 --lc-ctype=en_US.utf8",
+ "icu",
+ "");
+
+test_initdb(
+ "en_US.utf8 lc_collate with C ctype",
+ "--lc-collate=en_US.utf8 --lc-ctype=C",
+ "",
+ "collations with different collate and ctype values are not supported by "
+ . "ICU");
+
+test_initdb(
+ "be_BY\@latin icu lc_collate",
+ "--lc-collate=be_BY\@latin\@icu --lc-ctype=be_BY\@latin",
+ "icu",
+ "");
+
+test_initdb(
+ "be_BY\@latin icu lc_collate invalid modifier order",
+ "--lc-collate=be_BY\@icu\@latin",
+ "",
+ "invalid locale name \"be_BY\@icu\@latin\" \\(provider \"libc\"\\)");
+
+# test createdb and CREATE DATABASE
+
+sub test_createdb
+{
+ my ($test_name, $options, $expected_collprovider, $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ @command = ("createdb",
+ split(" ", $options),
+ "--template=template0",
+ "mydb");
+
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "createdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+
+ ($result, $out_command, $err_command) = psql(
+ "select datcollate from pg_database where datname = 'mydb';");
+
+ if ($expected_collprovider eq "libc")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider\n},
+ "createdb: $test_name: check pg_database.datcollate");
+ }
+ elsif ($expected_collprovider eq "icu")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider([\.\d]+)?\n},
+ "createdb: $test_name: check pg_database.datcollate");
+ }
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+sub test_create_database
+{
+ my ($test_name,
+ $options,
+ $expected_collprovider,
+ $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ ($result, $out_command, $err_command) = psql(
+ "create database mydb "
+ . $options
+ . " template = template0;");
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "CREATE DATABASE: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "CREATE DATABASE: $test_name: exit code 0");
+ is($err_command, "", "CREATE DATABASE: $test_name: no stderr");
+ like($out_command, qr{CREATE DATABASE}, "CREATE DATABASE: $test_name: check output");
+
+ ($result, $out_command, $err_command) = psql(
+ "select datcollate from pg_database where datname = 'mydb';");
+
+ if ($expected_collprovider eq "libc")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider\n},
+ "CREATE DATABASE: $test_name: check pg_database.datcollate");
+ }
+ elsif ($expected_collprovider eq "icu")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider([\.\d]+)?\n},
+ "CREATE DATABASE: $test_name: check pg_database.datcollate");
+ }
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+sub test_default_collation
+{
+ my ($createdb_options, $collation, $expected_collprovider, @commands) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ @command = ("createdb", split(" ", $createdb_options), "mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+
+ ($result, $out_command, $err_command) = psql(
+ "select datcollate from pg_database where datname = 'mydb';");
+
+ if ($expected_collprovider eq "libc")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider\n},
+ "\"@command\" check output");
+ }
+ elsif ($expected_collprovider eq "icu")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider([\.\d]+)?\n},
+ "\"@command\" check output");
+ }
+
+ for (my $row = 0; $row <= $#commands; $row++)
+ {
+ my ($command_text, $expected) = @{$commands[$row]};
+ ($result, $out_command, $err_command) = psql($command_text, "mydb");
+
+ ok($result, "\"$command_text\" exit code 0");
+ is($err_command, "", "\"$command_text\" no stderr");
+ if ($out_command)
+ {
+ is(
+ $out_command,
+ $expected,
+ "default collation "
+ . $collation
+ . ": \""
+ . $command_text
+ . "\" check output");
+ }
+ }
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+}
+
+my $node = get_new_node('main');
+$node->init;
+$node->start;
+local $ENV{PGPORT} = $node->port;
+
+my @command = ("createuser --createdb --no-superuser non_superuser");
+print "# Running: " . join(" ", @command) . "\n";
+system(@command);
+
+# test createdb
+
+# --locale
+
+test_createdb(
+ "en_US.utf8 locale",
+ "--locale=en_US.utf8",
+ "icu",
+ "");
+
+test_createdb(
+ "be_BY\@latin icu locale",
+ "--locale=be_BY\@latin\@icu",
+ "icu",
+ "");
+
+test_createdb(
+ "be_BY\@latin icu locale invalid modifier order",
+ "--locale=be_BY\@icu\@latin",
+ "",
+ "invalid locale name: \"be_BY\@icu\@latin\"");
+
+# --lc-collate with the same --lc-ctype if needed
+
+test_createdb(
+ "en_US.utf8 lc_collate",
+ "--lc-collate=en_US.utf8 --lc-ctype=en_US.utf8",
+ "icu",
+ "");
+
+test_createdb(
+ "en_US.utf8 lc_collate with C ctype",
+ "--lc-collate=en_US.utf8 --lc-ctype=C",
+ "",
+ "collations with different collate and ctype values are not supported by "
+ . "ICU");
+
+test_createdb(
+ "be_BY\@latin icu lc_collate",
+ "--lc-collate=be_BY\@latin\@icu --lc-ctype=be_BY\@latin",
+ "icu",
+ "");
+
+test_createdb(
+ "be_BY\@latin icu lc_collate invalid modifier order",
+ "--lc-collate=be_BY\@icu\@latin",
+ "",
+ "invalid locale name: \"be_BY\@icu\@latin\" \\(provider \"libc\"\\)");
+
+# test CREATE DATABASE
+
+test_create_database(
+ "en_US.utf8 lc_collate",
+ "LC_COLLATE = 'en_US.utf8' LC_CTYPE = 'en_US.utf8'",
+ "icu",
+ "");
+
+test_create_database(
+ "en_US.utf8 lc_collate with C ctype",
+ "LC_COLLATE = 'en_US.utf8' LC_CTYPE = 'C'",
+ "",
+ "collations with different collate and ctype values are not supported by "
+ . "ICU");
+
+test_create_database(
+ "be_BY\@latin icu lc_collate",
+ "LC_COLLATE = 'be_BY\@latin' LC_CTYPE = 'be_BY\@latin'",
+ "icu",
+ "");
+
+test_create_database(
+ "be_BY\@latin icu lc_collate invalid modifier order",
+ "LC_COLLATE = 'be_BY\@icu\@latin'",
+ "",
+ "invalid locale name: \"be_BY\@icu\@latin\" \\(provider \"libc\"\\)");
+
+# test default collation behaviour
+# use commands and outputs from the regression test collate.icu.utf8
+
+test_default_collation(
+ "--lc-collate=en_US.utf8 --lc-ctype=en_US.utf8 --template=template0",
+ "en_US.utf8\@icu",
+ "icu",
+ (
+ [
+ "CREATE TABLE collate_test1 (a int, b text NOT NULL);",
+ "CREATE TABLE\n"
+ ],
+ [
+ "INSERT INTO collate_test1 VALUES "
+ . "(1, 'abc'), (2, 'äbc'), (3, 'bbc'), (4, 'ABC');",
+ "INSERT 0 4\n"],
+ [
+ "SELECT * FROM collate_test1 WHERE b >= 'bbc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 3 | bbc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, b FROM collate_test1 ORDER BY b;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # star expansion
+ [
+ "SELECT * FROM collate_test1 ORDER BY b;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # upper/lower
+ ["CREATE TABLE collate_test10 (a int, x text);", "CREATE TABLE\n"],
+ [
+ "INSERT INTO collate_test10 VALUES (1, 'hij'), (2, 'HIJ');",
+ "INSERT 0 2\n"
+ ],
+ [
+ "SELECT a, lower(x), upper(x), initcap(x) FROM collate_test10;",
+ " a | lower | upper | initcap \n"
+ . "---+-------+-------+---------\n"
+ . " 1 | hij | HIJ | Hij\n"
+ . " 2 | hij | HIJ | Hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ # LIKE/ILIKE
+ [
+ "SELECT * FROM collate_test1 WHERE b LIKE 'abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b LIKE 'abc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b LIKE '%bc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(3 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ILIKE 'abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ILIKE 'abc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ILIKE '%bc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 4 | ABC\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'Türkiye' ILIKE '%KI%' AS \"true\";",
+ " true \n"
+ . "------\n"
+ . " t\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'bıt' ILIKE 'BIT' AS \"false\";",
+ " false \n"
+ . "-------\n"
+ . " f\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ # regular expressions
+ [
+ "SELECT * FROM collate_test1 WHERE b ~ '^abc\$';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~ '^abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~ 'bc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(3 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~* '^abc\$';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~* '^abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~* 'bc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 4 | ABC\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ ["CREATE TABLE collate_test6 (a int, b text);", "CREATE TABLE\n", ""],
+ [
+ "INSERT INTO collate_test6 VALUES "
+ . "(1, 'abc'), (2, 'ABC'), (3, '123'), (4, 'ab1'), "
+ . "(5, 'a1!'), (6, 'a c'), (7, '!.;'), (8, ' '), "
+ . "(9, 'äbç'), (10, 'ÄBÇ');",
+ "INSERT 0 10\n"
+ ],
+ [
+ "SELECT b, "
+ . "b ~ '^[[:alpha:]]+\$' AS is_alpha, "
+ . "b ~ '^[[:upper:]]+\$' AS is_upper, "
+ . "b ~ '^[[:lower:]]+\$' AS is_lower, "
+ . "b ~ '^[[:digit:]]+\$' AS is_digit, "
+ . "b ~ '^[[:alnum:]]+\$' AS is_alnum, "
+ . "b ~ '^[[:graph:]]+\$' AS is_graph, "
+ . "b ~ '^[[:print:]]+\$' AS is_print, "
+ . "b ~ '^[[:punct:]]+\$' AS is_punct, "
+ . "b ~ '^[[:space:]]+\$' AS is_space "
+ . "FROM collate_test6;",
+ " b | is_alpha | is_upper | is_lower | is_digit | is_alnum | is_graph | is_print | is_punct | is_space \n"
+ . "-----+----------+----------+----------+----------+----------+----------+----------+----------+----------\n"
+ . " abc | t | f | t | f | t | t | t | f | f\n"
+ . " ABC | t | t | f | f | t | t | t | f | f\n"
+ . " 123 | f | f | f | t | t | t | t | f | f\n"
+ . " ab1 | f | f | f | f | t | t | t | f | f\n"
+ . " a1! | f | f | f | f | f | t | t | f | f\n"
+ . " a c | f | f | f | f | f | f | t | f | f\n"
+ . " !.; | f | f | f | f | f | t | t | t | f\n"
+ . " | f | f | f | f | f | f | t | f | t\n"
+ . " äbç | t | f | t | f | t | t | t | f | f\n"
+ . " ÄBÇ | t | t | f | f | t | t | t | f | f\n"
+ . "(10 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'Türkiye' ~* 'KI' AS \"true\";",
+ " true \n"
+ . "------\n"
+ . " t\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'bıt' ~* 'BIT' AS \"false\";",
+ " false \n"
+ . "-------\n"
+ . " f\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, lower(coalesce(x, 'foo')) FROM collate_test10;",
+ " a | lower \n"
+ . "---+-------\n"
+ . " 1 | hij\n"
+ . " 2 | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, b, greatest(b, 'CCC') FROM collate_test1 ORDER BY 3;",
+ " a | b | greatest \n"
+ . "---+-----+----------\n"
+ . " 1 | abc | CCC\n"
+ . " 2 | äbc | CCC\n"
+ . " 3 | bbc | CCC\n"
+ . " 4 | ABC | CCC\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, x, lower(greatest(x, 'foo')) FROM collate_test10;",
+ " a | x | lower \n"
+ . "---+-----+-------\n"
+ . " 1 | hij | hij\n"
+ . " 2 | HIJ | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, nullif(b, 'abc') FROM collate_test1 ORDER BY 2;",
+ " a | nullif \n"
+ . "---+--------\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 1 | \n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, lower(nullif(x, 'foo')) FROM collate_test10;",
+ " a | lower \n"
+ . "---+-------\n"
+ . " 1 | hij\n"
+ . " 2 | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, CASE b WHEN 'abc' THEN 'abcd' ELSE b END "
+ . "FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+------\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 1 | abcd\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ ["CREATE DOMAIN testdomain AS text;", "CREATE DOMAIN\n", ""],
+ [
+ "SELECT a, b::testdomain FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, lower(x::testdomain) FROM collate_test10;",
+ " a | lower \n"
+ . "---+-------\n"
+ . " 1 | hij\n"
+ . " 2 | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT min(b), max(b) FROM collate_test1;",
+ " min | max \n"
+ . "-----+-----\n"
+ . " abc | bbc\n"
+ . "(1 row)\n"
+ . "\n",
+ ""
+ ],
+ [
+ "SELECT array_agg(b ORDER BY b) FROM collate_test1;",
+ " array_agg \n"
+ . "-------------------\n"
+ . " {abc,ABC,äbc,bbc}\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, b FROM collate_test1 "
+ . "UNION ALL "
+ . "SELECT a, b FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 3 | bbc\n"
+ . "(8 rows)\n"
+ . "\n"
+ ],
+ # casting
+ [
+ "SELECT a, CAST(b AS varchar) FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # propagation of collation in SQL functions (inlined and non-inlined
+ # cases) and plpgsql functions too
+ [
+ "CREATE FUNCTION mylt (text, text) RETURNS boolean LANGUAGE sql "
+ . "AS \$\$ select \$1 < \$2 \$\$;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "CREATE FUNCTION mylt_noninline (text, text) "
+ . "RETURNS boolean LANGUAGE sql "
+ . "AS \$\$ select \$1 < \$2 limit 1 \$\$;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "CREATE FUNCTION mylt_plpgsql (text, text) "
+ . "RETURNS boolean LANGUAGE plpgsql "
+ . "AS \$\$ begin return \$1 < \$2; end \$\$;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "SELECT a.b AS a, b.b AS b, a.b < b.b AS lt, "
+ . "mylt(a.b, b.b), mylt_noninline(a.b, b.b), mylt_plpgsql(a.b, b.b) "
+ . "FROM collate_test1 a, collate_test1 b "
+ . "ORDER BY a.b, b.b;",
+ " a | b | lt | mylt | mylt_noninline | mylt_plpgsql \n"
+ . "-----+-----+----+------+----------------+--------------\n"
+ . " abc | abc | f | f | f | f\n"
+ . " abc | ABC | t | t | t | t\n"
+ . " abc | äbc | t | t | t | t\n"
+ . " abc | bbc | t | t | t | t\n"
+ . " ABC | abc | f | f | f | f\n"
+ . " ABC | ABC | f | f | f | f\n"
+ . " ABC | äbc | t | t | t | t\n"
+ . " ABC | bbc | t | t | t | t\n"
+ . " äbc | abc | f | f | f | f\n"
+ . " äbc | ABC | f | f | f | f\n"
+ . " äbc | äbc | f | f | f | f\n"
+ . " äbc | bbc | t | t | t | t\n"
+ . " bbc | abc | f | f | f | f\n"
+ . " bbc | ABC | f | f | f | f\n"
+ . " bbc | äbc | f | f | f | f\n"
+ . " bbc | bbc | f | f | f | f\n"
+ . "(16 rows)\n"
+ . "\n"
+ ],
+ # polymorphism
+ [
+ "SELECT * FROM unnest("
+ . "(SELECT array_agg(b ORDER BY b) FROM collate_test1)"
+ . ") ORDER BY 1;",
+ " unnest \n"
+ . "--------\n"
+ . " abc\n"
+ . " ABC\n"
+ . " äbc\n"
+ . " bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "CREATE FUNCTION dup (anyelement) RETURNS anyelement "
+ . "AS 'select \$1' LANGUAGE sql;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "SELECT a, dup(b) FROM collate_test1 ORDER BY 2;",
+ " a | dup \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # indexes
+ [
+ "CREATE INDEX collate_test1_idx1 ON collate_test1 (b);",
+ "CREATE INDEX\n"
+ ]
+ )
+);
+
+$node->stop;
diff --git a/src/test/default_collation/icu/.gitignore b/src/test/default_collation/icu/.gitignore
new file mode 100644
index 0000000..871e943
--- /dev/null
+++ b/src/test/default_collation/icu/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/default_collation/icu/Makefile b/src/test/default_collation/icu/Makefile
new file mode 100644
index 0000000..5ee91d8
--- /dev/null
+++ b/src/test/default_collation/icu/Makefile
@@ -0,0 +1,11 @@
+# src/test/default_collation/icu/Makefile
+
+subdir = src/test/default_collation/icu
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+ $(prove_check)
+
+clean distclean maintainer-clean:
+ rm -rf tmp_check
diff --git a/src/test/default_collation/icu/t/001_default_collation.pl b/src/test/default_collation/icu/t/001_default_collation.pl
new file mode 100644
index 0000000..8b58be3
--- /dev/null
+++ b/src/test/default_collation/icu/t/001_default_collation.pl
@@ -0,0 +1,605 @@
+use strict;
+use warnings;
+
+use Config;
+use PostgresNode;
+use TestLib;
+use Test::More;
+
+my $tempdir = TestLib::tempdir;
+my $datadir = "$tempdir/data";
+
+# check whether ICU can convert C locale to a language tag
+
+my ($in_initdb, $out_initdb, $err_initdb);
+my @command = (qw(initdb -A trust -N -D), $datadir, "--locale=C\@icu");
+print "# Running: " . join(" ", @command) . "\n";
+my $result = IPC::Run::run \@command, \$in_initdb, \$out_initdb, \$err_initdb;
+
+my $c_to_icu_language_tag = (
+ $err_initdb !~ /ICU error: could not convert locale name "C" to language tag: U_ILLEGAL_ARGUMENT_ERROR/);
+
+# get the number of tests
+
+plan tests => $c_to_icu_language_tag ? 124 : 110;
+
+# test initdb
+
+sub test_initdb
+{
+ my ($test_name, $options, $expected_collprovider, $error_message) = @_;
+ my ($in_initdb, $out_initdb, $err_initdb);
+
+ mkdir $datadir;
+
+ my @command = (qw(initdb -A trust -N -D), $datadir, split(" ", $options));
+ print "# Running: " . join(" ", @command) . "\n";
+ my $result = IPC::Run::run \@command, \$in_initdb, \$out_initdb,
+ \$err_initdb;
+
+ if ($error_message)
+ {
+ like($err_initdb,
+ qr{$error_message},
+ "initdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_initdb, "", "\"@command\" no stderr");
+ like($out_initdb,
+ qr{The default collation provider is \"$expected_collprovider\"\.},
+ "initdb: $test_name: check output");
+ }
+
+ File::Path::rmtree $datadir;
+}
+
+# --locale
+
+test_initdb(
+ "empty libc locale",
+ "--locale=\@libc",
+ "libc",
+ "");
+
+test_initdb(
+ "C locale without collation provider",
+ "--locale=C",
+ "libc",
+ "");
+
+test_initdb(
+ "POSIX locale without collation provider",
+ "--locale=POSIX",
+ "libc",
+ "");
+
+test_initdb(
+ "C libc locale",
+ "--locale=C\@libc",
+ "libc",
+ "");
+
+test_initdb(
+ "POSIX libc locale",
+ "--locale=POSIX\@libc",
+ "libc",
+ "");
+
+test_initdb(
+ "C icu locale",
+ "--locale=C\@icu",
+ "",
+ ($c_to_icu_language_tag ?
+ "selected encoding \\(SQL_ASCII\\) is not supported for ICU locales" :
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR"));
+
+test_initdb(
+ "POSIX icu locale",
+ "--locale=POSIX\@icu",
+ "",
+ ($c_to_icu_language_tag ?
+ "selected encoding \\(SQL_ASCII\\) is not supported for ICU locales" :
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR"));
+
+test_initdb(
+ "C locale too many modifiers",
+ "--locale=C\@icu\@libc",
+ "",
+ "invalid locale name \"C\@icu\"");
+
+test_initdb(
+ "ICU language tag format locale",
+ "--locale=und-x-icu",
+ "",
+ "invalid locale name \"und-x-icu\"");
+
+# --lc-collate with the same --lc-ctype if needed
+
+test_initdb(
+ "empty libc lc_collate",
+ "--lc-collate=\@libc",
+ "libc",
+ "");
+
+test_initdb(
+ "C lc_collate without collation provider",
+ "--lc-collate=C",
+ "libc",
+ "");
+
+test_initdb(
+ "POSIX lc_collate without collation provider",
+ "--lc-collate=POSIX",
+ "libc",
+ "");
+
+test_initdb(
+ "C libc lc_collate",
+ "--lc-collate=C\@libc",
+ "libc",
+ "");
+
+test_initdb(
+ "POSIX libc lc_collate",
+ "--lc-collate=POSIX\@libc",
+ "libc",
+ "");
+
+test_initdb(
+ "C icu lc_collate",
+ "--lc-collate=C\@icu --lc-ctype=C",
+ "",
+ ($c_to_icu_language_tag ?
+ "selected encoding \\(SQL_ASCII\\) is not supported for ICU locales" :
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR"));
+
+test_initdb(
+ "POSIX icu lc_collate",
+ "--lc-collate=POSIX\@icu --lc-ctype=POSIX",
+ "",
+ ($c_to_icu_language_tag ?
+ "selected encoding \\(SQL_ASCII\\) is not supported for ICU locales" :
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR"));
+
+test_initdb(
+ "C lc_collate too many modifiers",
+ "--lc-collate=C\@icu\@libc",
+ "",
+ "invalid locale name \"C\@icu\"");
+
+test_initdb(
+ "ICU language tag format lc_collate",
+ "--lc-collate=und-x-icu",
+ "",
+ "invalid locale name \"und-x-icu\"");
+
+# --locale & --lc-collate
+
+test_initdb(
+ "lc_collate implicit provider takes precedence",
+ "--locale=\@icu --lc-collate=C",
+ "libc",
+ "");
+
+test_initdb(
+ "lc_collate explicit provider takes precedence",
+ "--locale=C\@libc --lc-collate=C\@icu",
+ "",
+ ($c_to_icu_language_tag ?
+ "selected encoding \\(SQL_ASCII\\) is not supported for ICU locales" :
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR"));
+
+# test createdb and CREATE DATABASE
+
+sub test_createdb
+{
+ my ($test_name, $options, $expected_collprovider, $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ @command = ("createdb",
+ split(" ", $options),
+ "--template=template0",
+ "mydb");
+
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "createdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+
+ @command = (
+ "psql",
+ "-c",
+ "select datcollate from pg_database where datname = 'mydb';");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($expected_collprovider eq "libc")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider\n},
+ "createdb: $test_name: check pg_database.datcollate");
+ }
+ elsif ($expected_collprovider eq "icu")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider([\.\d]+)?\n},
+ "createdb: $test_name: check pg_database.datcollate");
+ }
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+sub test_create_database
+{
+ my ($test_name,
+ $createdb_options,
+ $psql_options,
+ $expected_collprovider,
+ $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ @command = ("psql",
+ split(" ", $psql_options),
+ "-c",
+ "create database mydb "
+ . $createdb_options
+ . " template = template0;");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "CREATE DATABASE: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+ like($out_command, qr{CREATE DATABASE}, "\"@command\" check output");
+
+ @command = (
+ "psql",
+ "-c",
+ "select datcollate from pg_database where datname = 'mydb';");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($expected_collprovider eq "libc")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider\n},
+ "CREATE DATABASE: $test_name: check pg_database.datcollate");
+ }
+ elsif ($expected_collprovider eq "icu")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider([\.\d]+)?\n},
+ "CREATE DATABASE: $test_name: check pg_database.datcollate");
+ }
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+my $node = get_new_node('main');
+$node->init;
+$node->start;
+local $ENV{PGPORT} = $node->port;
+
+@command = ("createuser --createdb --no-superuser non_superuser");
+print "# Running: " . join(" ", @command) . "\n";
+system(@command);
+
+# test createdb
+
+# --locale
+
+test_createdb(
+ "empty libc locale",
+ "--locale=\@libc",
+ "libc",
+ "");
+
+test_createdb(
+ "C locale without collation provider",
+ "--locale=C",
+ "libc",
+ "");
+
+test_createdb(
+ "POSIX locale without collation provider",
+ "--locale=POSIX",
+ "libc",
+ "");
+
+test_createdb(
+ "C libc locale",
+ "--locale=C\@libc",
+ "libc",
+ "");
+
+test_createdb(
+ "POSIX libc locale",
+ "--locale=POSIX\@libc",
+ "libc",
+ "");
+
+if ($c_to_icu_language_tag)
+{
+ test_createdb(
+ "C icu locale with SQL_ASCII encoding and superuser",
+ "--locale=C\@icu --encoding=SQL_ASCII",
+ "icu",
+ "");
+}
+else
+{
+ test_createdb(
+ "C icu locale with SQL_ASCII encoding and superuser",
+ "--locale=C\@icu --encoding=SQL_ASCII",
+ "",
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR");
+}
+
+test_createdb(
+ "C icu locale with SQL_ASCII encoding and non-superuser",
+ "--locale=C\@icu --encoding=SQL_ASCII --username=non_superuser",
+ "",
+ "encoding \"SQL_ASCII\" is not supported for ICU locales");
+
+if ($c_to_icu_language_tag)
+{
+ test_createdb(
+ "POSIX icu locale with SQL_ASCII encoding and superuser",
+ "--locale=POSIX\@icu --encoding=SQL_ASCII",
+ "icu",
+ "");
+}
+else
+{
+ test_createdb(
+ "POSIX icu locale with SQL_ASCII encoding and superuser",
+ "--locale=POSIX\@icu --encoding=SQL_ASCII",
+ "",
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR");
+}
+
+test_createdb(
+ "POSIX icu locale with SQL_ASCII encoding and non-superuser",
+ "--locale=POSIX\@icu --encoding=SQL_ASCII --username=non_superuser",
+ "",
+ "encoding \"SQL_ASCII\" is not supported for ICU locales");
+
+test_createdb(
+ "C locale too many modifiers",
+ "--locale=C\@icu\@libc",
+ "",
+ "invalid locale name: \"C\@icu\"");
+
+test_createdb(
+ "ICU language tag format locale",
+ "--locale=und-x-icu",
+ "",
+ "invalid locale name: \"und-x-icu\"");
+
+# --lc-collate with the same --lc-ctype if needed
+
+test_createdb(
+ "empty libc lc_collate",
+ "--lc-collate=\@libc",
+ "libc",
+ "");
+
+test_createdb(
+ "C lc_collate without collation provider",
+ "--lc-collate=C --lc-ctype=C",
+ "libc",
+ "");
+
+test_createdb(
+ "POSIX lc_collate without collation provider",
+ "--lc-collate=POSIX --lc-ctype=POSIX",
+ "libc",
+ "");
+
+test_createdb(
+ "C libc lc_collate",
+ "--lc-collate=C\@libc --lc-ctype=C",
+ "libc",
+ "");
+
+test_createdb(
+ "POSIX libc lc_collate",
+ "--lc-collate=POSIX\@libc --lc-ctype=POSIX",
+ "libc",
+ "");
+
+if ($c_to_icu_language_tag)
+{
+ test_createdb(
+ "C icu lc_collate with SQL_ASCII encoding and superuser",
+ "--lc-collate=C\@icu --lc-ctype=C --encoding=SQL_ASCII",
+ "icu",
+ "");
+}
+else
+{
+ test_createdb(
+ "C icu lc_collate with SQL_ASCII encoding and superuser",
+ "--lc-collate=C\@icu --lc-ctype=C --encoding=SQL_ASCII",
+ "",
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR");
+}
+
+test_createdb(
+ "C icu lc_collate with SQL_ASCII encoding and non-superuser",
+ "--lc-collate=C\@icu --lc-ctype=C --encoding=SQL_ASCII "
+ . "--username=non_superuser",
+ "",
+ "encoding \"SQL_ASCII\" is not supported for ICU locales");
+
+if ($c_to_icu_language_tag)
+{
+ test_createdb(
+ "POSIX icu lc_collate with SQL_ASCII encoding and superuser",
+ "--lc-collate=POSIX\@icu --lc-ctype=POSIX --encoding=SQL_ASCII",
+ "icu",
+ "");
+
+}
+else
+{
+ test_createdb(
+ "POSIX icu lc_collate with SQL_ASCII encoding and superuser",
+ "--lc-collate=POSIX\@icu --lc-ctype=POSIX --encoding=SQL_ASCII",
+ "",
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR");
+}
+
+test_createdb(
+ "POSIX icu lc_collate with SQL_ASCII encoding and non-superuser",
+ "--lc-collate=POSIX\@icu --lc-ctype=POSIX --encoding=SQL_ASCII "
+ . "--username=non_superuser",
+ "",
+ "encoding \"SQL_ASCII\" is not supported for ICU locales");
+
+test_createdb(
+ "C lc_collate too many modifiers",
+ "--lc-collate=C\@icu\@libc",
+ "",
+ "invalid locale name: \"C\@icu\"");
+
+test_createdb(
+ "ICU language tag format lc_collate",
+ "--lc-collate=und-x-icu",
+ "",
+ "invalid locale name: \"und-x-icu\"");
+
+# test CREATE DATABASE
+
+# LC_COLLATE with the same LC_CTYPE if needed
+
+test_create_database(
+ "empty libc lc_collate",
+ "LC_COLLATE = '\@libc'",
+ "",
+ "libc",
+ "");
+
+test_create_database(
+ "C lc_collate without collation provider",
+ "LC_COLLATE = 'C' LC_CTYPE = 'C'",
+ "",
+ "libc",
+ "");
+
+test_create_database(
+ "POSIX lc_collate without collation provider",
+ "LC_COLLATE = 'POSIX' LC_CTYPE = 'POSIX'",
+ "",
+ "libc",
+ "");
+
+test_create_database(
+ "C libc lc_collate",
+ "LC_COLLATE = 'C\@libc' LC_CTYPE = 'C'",
+ "",
+ "libc",
+ "");
+
+test_create_database(
+ "POSIX libc lc_collate",
+ "LC_COLLATE = 'POSIX\@libc' LC_CTYPE = 'POSIX'",
+ "",
+ "libc",
+ "");
+
+if ($c_to_icu_language_tag)
+{
+ test_create_database(
+ "C icu lc_collate with SQL_ASCII encoding and superuser",
+ "LC_COLLATE = 'C\@icu' LC_CTYPE = 'C' ENCODING = 'SQL_ASCII'",
+ "",
+ "icu",
+ "");
+}
+else
+{
+ test_create_database(
+ "C icu lc_collate with SQL_ASCII encoding and superuser",
+ "LC_COLLATE = 'C\@icu' LC_CTYPE = 'C' ENCODING = 'SQL_ASCII'",
+ "",
+ "",
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR");
+}
+
+test_create_database(
+ "C icu lc_collate with SQL_ASCII encoding and non-superuser",
+ "LC_COLLATE = 'C\@icu' LC_CTYPE = 'C' ENCODING = 'SQL_ASCII'",
+ "--username=non_superuser",
+ "",
+ "encoding \"SQL_ASCII\" is not supported for ICU locales");
+
+if ($c_to_icu_language_tag)
+{
+ test_create_database(
+ "POSIX icu lc_collate with SQL_ASCII encoding and superuser",
+ "LC_COLLATE = 'POSIX\@icu' LC_CTYPE = 'POSIX' ENCODING = 'SQL_ASCII'",
+ "",
+ "icu",
+ "");
+}
+else
+{
+ test_create_database(
+ "POSIX icu lc_collate with SQL_ASCII encoding and superuser",
+ "LC_COLLATE = 'POSIX\@icu' LC_CTYPE = 'POSIX' ENCODING = 'SQL_ASCII'",
+ "",
+ "",
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR");
+}
+
+test_create_database(
+ "POSIX icu lc_collate with SQL_ASCII encoding and non-superuser",
+ "LC_COLLATE = 'POSIX\@icu' LC_CTYPE = 'POSIX' ENCODING = 'SQL_ASCII'",
+ "--username=non_superuser",
+ "",
+ "encoding \"SQL_ASCII\" is not supported for ICU locales");
+
+test_create_database(
+ "C lc_collate too many modifiers",
+ "LC_COLLATE = 'C\@icu\@libc'",
+ "",
+ "",
+ "invalid locale name: \"C\@icu\"");
+
+test_create_database(
+ "ICU language tag format lc_collate",
+ "LC_COLLATE = 'und-x-icu'",
+ "",
+ "",
+ "invalid locale name: \"und-x-icu\"");
+
+$node->stop;
diff --git a/src/test/default_collation/libc.utf8/.gitignore b/src/test/default_collation/libc.utf8/.gitignore
new file mode 100644
index 0000000..871e943
--- /dev/null
+++ b/src/test/default_collation/libc.utf8/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/default_collation/libc.utf8/Makefile b/src/test/default_collation/libc.utf8/Makefile
new file mode 100644
index 0000000..e5b9d20
--- /dev/null
+++ b/src/test/default_collation/libc.utf8/Makefile
@@ -0,0 +1,11 @@
+# src/test/default_collation/libc.utf8/Makefile
+
+subdir = src/test/default_collation/libc.utf8
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+ $(prove_check)
+
+clean distclean maintainer-clean:
+ rm -rf tmp_check
diff --git a/src/test/default_collation/libc.utf8/t/001_default_collation.pl b/src/test/default_collation/libc.utf8/t/001_default_collation.pl
new file mode 100644
index 0000000..e4b3552
--- /dev/null
+++ b/src/test/default_collation/libc.utf8/t/001_default_collation.pl
@@ -0,0 +1,703 @@
+use strict;
+use warnings;
+
+use Config;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 168;
+
+my $tempdir = TestLib::tempdir;
+my $datadir = "$tempdir/data";
+
+# test initdb
+
+sub test_initdb
+{
+ my ($test_name, $options, $error_message) = @_;
+ my ($in_initdb, $out_initdb, $err_initdb);
+
+ mkdir $datadir;
+
+ my @command = (qw(initdb -A trust -N -D), $datadir, split(" ", $options));
+ print "# Running: " . join(" ", @command) . "\n";
+ my $result = IPC::Run::run \@command, \$in_initdb, \$out_initdb,
+ \$err_initdb;
+
+ if ($error_message)
+ {
+ like($err_initdb,
+ qr{$error_message},
+ "initdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_initdb, "", "\"@command\" no stderr");
+ like($out_initdb,
+ qr{The default collation provider is \"libc\"\.},
+ "initdb: $test_name: check output");
+ }
+
+ File::Path::rmtree $datadir;
+}
+
+sub psql
+{
+ my ($command, $db) = @_;
+ my ($result, $in, $out, $err);
+ my @psql = ('psql', '-X', '-c', $command);
+ if (defined($db))
+ {
+ push(@psql, $db);
+ }
+ print "# Running: " . join(" ", @psql) . "\n";
+ $result = IPC::Run::run \@psql, \$in, \$out, \$err;
+ ($result, $out, $err);
+}
+
+# --locale
+
+test_initdb(
+ "be_BY\@latin libc locale",
+ "--locale=be_BY\@latin\@libc",
+ "");
+
+test_initdb(
+ "be_BY\@latin libc locale invalid modifier order",
+ "--locale=be_BY\@libc\@latin",
+ "invalid locale name \"be_BY\@libc\@latin\"");
+
+# --lc-collate
+
+test_initdb(
+ "be_BY\@latin libc lc_collate",
+ "--lc-collate=be_BY\@latin\@libc",
+ "");
+
+test_initdb(
+ "be_BY\@latin libc lc_collate invalid modifier order",
+ "--lc-collate=be_BY\@libc\@latin",
+ "invalid locale name \"be_BY\@libc\@latin\" \\(provider \"libc\"\\)");
+
+# test createdb, CREATE DATABASE and default collation behaviour
+
+sub test_createdb
+{
+ my ($test_name, $options, $from_template0, $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ if ($from_template0)
+ {
+ $options = $options . " --template=template0";
+ }
+
+ @command = ("createdb", split(" ", $options), "mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "createdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+
+ ($result, $out_command, $err_command) = psql(
+ "select datcollate from pg_database where datname = 'mydb';");
+
+ like($out_command,
+ qr{\@libc\n},
+ "createdb: $test_name: check pg_database.datcollate");
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+sub test_create_database
+{
+ my ($test_name, $options, $from_template0, $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ ($result, $out_command, $err_command) = psql(
+ "create database mydb "
+ . $options
+ . ($from_template0 ? " TEMPLATE = template0;" : ";"));
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "CREATE DATABASE: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "CREATE DATABASE: $test_name: exit code 0");
+ is($err_command, "", "CREATE DATABASE: $test_name: no stderr");
+ like($out_command, qr{CREATE DATABASE}, "CREATE DATABASE: $test_name: check output");
+
+ ($result, $out_command, $err_command) = psql(
+ "select datcollate from pg_database where datname = 'mydb';");
+
+ like($out_command,
+ qr{\@libc\n},
+ "CREATE DATABASE: $test_name: check pg_database.datcollate");
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+sub test_default_collation
+{
+ my ($createdb_options, $collation, @commands) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ @command = ("createdb", split(" ", $createdb_options), "mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+
+ ($result, $out_command, $err_command) = psql(
+ "select datcollate from pg_database where datname = 'mydb';");
+
+ like($out_command, qr{\@libc\n}, "\"@command\" check output");
+
+ for (my $row = 0; $row <= $#commands; $row++)
+ {
+ my ($command_text, $expected) = @{$commands[$row]};
+ ($result, $out_command, $err_command) = psql($command_text, "mydb");
+
+ ok($result, "\"$command_text\" exit code 0");
+ is($err_command, "", "\"$command_text\" no stderr");
+ if ($out_command)
+ {
+ is(
+ $out_command,
+ $expected,
+ "default collation "
+ . $collation
+ . ": \""
+ . $command_text
+ . "\" check output");
+ }
+ }
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+}
+
+my $node = get_new_node('main');
+$node->init;
+$node->start;
+local $ENV{PGPORT} = $node->port;
+
+# test createdb
+
+# --locale
+
+test_createdb(
+ "be_BY\@latin libc locale",
+ "--locale=be_BY\@latin\@libc",
+ 1,
+ "");
+
+test_createdb(
+ "be_BY\@latin libc locale invalid modifier order",
+ "--locale=be_BY\@libc\@latin",
+ 1,
+ "invalid locale name: \"be_BY\@libc\@latin\"");
+
+# --lc-collate
+
+test_createdb(
+ "be_BY\@latin libc lc_collate",
+ "--lc-collate=be_BY\@latin\@libc",
+ 1,
+ "");
+
+test_createdb(
+ "be_BY\@latin libc lc_collate invalid modifier order",
+ "--lc-collate=be_BY\@libc\@latin",
+ 1,
+ "invalid locale name: \"be_BY\@libc\@latin\" \\(provider \"libc\"\\)");
+
+# test CREATE DATABASE
+
+# LC_COLLATE
+
+test_create_database(
+ "be_BY\@latin libc lc_collate",
+ "LC_COLLATE = 'be_BY\@latin\@libc'",
+ 1,
+ "");
+
+test_create_database(
+ "be_BY\@latin libc lc_collate invalid modifier order",
+ "LC_COLLATE = 'be_BY\@libc\@latin'",
+ 1,
+ "invalid locale name: \"be_BY\@libc\@latin\" \\(provider \"libc\"\\)");
+
+# test default collation behaviour
+# use commands and outputs from the regression test collate.linux.utf8
+
+test_default_collation(
+ "--lc-collate=en_US.utf8\@libc --template=template0",
+ "en_US.utf8\@libc",
+ (
+ [
+ "CREATE TABLE collate_test1 (a int, b text NOT NULL);",
+ "CREATE TABLE\n"
+ ],
+ [
+ "INSERT INTO collate_test1 VALUES "
+ . "(1, 'abc'), (2, 'äbc'), (3, 'bbc'), (4, 'ABC');",
+ "INSERT 0 4\n"],
+ [
+ "SELECT * FROM collate_test1 WHERE b >= 'bbc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 3 | bbc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, b FROM collate_test1 ORDER BY b;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # star expansion
+ [
+ "SELECT * FROM collate_test1 ORDER BY b;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # upper/lower
+ ["CREATE TABLE collate_test10 (a int, x text);", "CREATE TABLE\n", ""],
+ [
+ "INSERT INTO collate_test10 VALUES (1, 'hij'), (2, 'HIJ');",
+ "INSERT 0 2\n"
+ ],
+ [
+ "SELECT a, lower(x), upper(x), initcap(x) FROM collate_test10;",
+ " a | lower | upper | initcap \n"
+ . "---+-------+-------+---------\n"
+ . " 1 | hij | HIJ | Hij\n"
+ . " 2 | hij | HIJ | Hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ # LIKE/ILIKE
+ [
+ "SELECT * FROM collate_test1 WHERE b LIKE 'abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b LIKE 'abc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b LIKE '%bc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(3 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ILIKE 'abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ILIKE 'abc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ILIKE '%bc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 4 | ABC\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'Türkiye' ILIKE '%KI%' AS \"true\";",
+ " true \n"
+ . "------\n"
+ . " t\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'bıt' ILIKE 'BIT' AS \"false\";",
+ " false \n"
+ . "-------\n"
+ . " f\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ # regular expressions
+ [
+ "SELECT * FROM collate_test1 WHERE b ~ '^abc\$';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~ '^abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~ 'bc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(3 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~* '^abc\$';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~* '^abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~* 'bc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 4 | ABC\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ ["CREATE TABLE collate_test6 (a int, b text);", "CREATE TABLE\n", ""],
+ [
+ "INSERT INTO collate_test6 VALUES "
+ . "(1, 'abc'), (2, 'ABC'), (3, '123'), (4, 'ab1'), "
+ . "(5, 'a1!'), (6, 'a c'), (7, '!.;'), (8, ' '), "
+ . "(9, 'äbç'), (10, 'ÄBÇ');",
+ "INSERT 0 10\n"
+ ],
+ [
+ "SELECT b, "
+ . "b ~ '^[[:alpha:]]+\$' AS is_alpha, "
+ . "b ~ '^[[:upper:]]+\$' AS is_upper, "
+ . "b ~ '^[[:lower:]]+\$' AS is_lower, "
+ . "b ~ '^[[:digit:]]+\$' AS is_digit, "
+ . "b ~ '^[[:alnum:]]+\$' AS is_alnum, "
+ . "b ~ '^[[:graph:]]+\$' AS is_graph, "
+ . "b ~ '^[[:print:]]+\$' AS is_print, "
+ . "b ~ '^[[:punct:]]+\$' AS is_punct, "
+ . "b ~ '^[[:space:]]+\$' AS is_space "
+ . "FROM collate_test6;",
+ " b | is_alpha | is_upper | is_lower | is_digit | is_alnum | is_graph | is_print | is_punct | is_space \n"
+ . "-----+----------+----------+----------+----------+----------+----------+----------+----------+----------\n"
+ . " abc | t | f | t | f | t | t | t | f | f\n"
+ . " ABC | t | t | f | f | t | t | t | f | f\n"
+ . " 123 | f | f | f | t | t | t | t | f | f\n"
+ . " ab1 | f | f | f | f | t | t | t | f | f\n"
+ . " a1! | f | f | f | f | f | t | t | f | f\n"
+ . " a c | f | f | f | f | f | f | t | f | f\n"
+ . " !.; | f | f | f | f | f | t | t | t | f\n"
+ . " | f | f | f | f | f | f | t | f | t\n"
+ . " äbç | t | f | t | f | t | t | t | f | f\n"
+ . " ÄBÇ | t | t | f | f | t | t | t | f | f\n"
+ . "(10 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'Türkiye' ~* 'KI' AS \"true\";",
+ " true \n"
+ . "------\n"
+ . " t\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'bıt' ~* 'BIT' AS \"false\";",
+ " false \n"
+ . "-------\n"
+ . " f\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, lower(coalesce(x, 'foo')) FROM collate_test10;",
+ " a | lower \n"
+ . "---+-------\n"
+ . " 1 | hij\n"
+ . " 2 | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, b, greatest(b, 'CCC') FROM collate_test1 ORDER BY 3;",
+ " a | b | greatest \n"
+ . "---+-----+----------\n"
+ . " 1 | abc | CCC\n"
+ . " 2 | äbc | CCC\n"
+ . " 3 | bbc | CCC\n"
+ . " 4 | ABC | CCC\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, x, lower(greatest(x, 'foo')) FROM collate_test10;",
+ " a | x | lower \n"
+ . "---+-----+-------\n"
+ . " 1 | hij | hij\n"
+ . " 2 | HIJ | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, nullif(b, 'abc') FROM collate_test1 ORDER BY 2;",
+ " a | nullif \n"
+ . "---+--------\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 1 | \n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, lower(nullif(x, 'foo')) FROM collate_test10;",
+ " a | lower \n"
+ . "---+-------\n"
+ . " 1 | hij\n"
+ . " 2 | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, CASE b WHEN 'abc' THEN 'abcd' ELSE b END "
+ . "FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+------\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 1 | abcd\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ ["CREATE DOMAIN testdomain AS text;", "CREATE DOMAIN\n", ""],
+ [
+ "SELECT a, b::testdomain FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, lower(x::testdomain) FROM collate_test10;",
+ " a | lower \n"
+ . "---+-------\n"
+ . " 1 | hij\n"
+ . " 2 | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT min(b), max(b) FROM collate_test1;",
+ " min | max \n"
+ . "-----+-----\n"
+ . " abc | bbc\n"
+ . "(1 row)\n"
+ . "\n",
+ ""
+ ],
+ [
+ "SELECT array_agg(b ORDER BY b) FROM collate_test1;",
+ " array_agg \n"
+ . "-------------------\n"
+ . " {abc,ABC,äbc,bbc}\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, b FROM collate_test1 "
+ . "UNION ALL "
+ . "SELECT a, b FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 3 | bbc\n"
+ . "(8 rows)\n"
+ . "\n"
+ ],
+ # casting
+ [
+ "SELECT a, CAST(b AS varchar) FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # propagation of collation in SQL functions (inlined and non-inlined
+ # cases) and plpgsql functions too
+ [
+ "CREATE FUNCTION mylt (text, text) RETURNS boolean LANGUAGE sql "
+ . "AS \$\$ select \$1 < \$2 \$\$;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "CREATE FUNCTION mylt_noninline (text, text) "
+ . "RETURNS boolean LANGUAGE sql "
+ . "AS \$\$ select \$1 < \$2 limit 1 \$\$;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "CREATE FUNCTION mylt_plpgsql (text, text) "
+ . "RETURNS boolean LANGUAGE plpgsql "
+ . "AS \$\$ begin return \$1 < \$2; end \$\$;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "SELECT a.b AS a, b.b AS b, a.b < b.b AS lt, "
+ . "mylt(a.b, b.b), mylt_noninline(a.b, b.b), mylt_plpgsql(a.b, b.b) "
+ . "FROM collate_test1 a, collate_test1 b "
+ . "ORDER BY a.b, b.b;",
+ " a | b | lt | mylt | mylt_noninline | mylt_plpgsql \n"
+ . "-----+-----+----+------+----------------+--------------\n"
+ . " abc | abc | f | f | f | f\n"
+ . " abc | ABC | t | t | t | t\n"
+ . " abc | äbc | t | t | t | t\n"
+ . " abc | bbc | t | t | t | t\n"
+ . " ABC | abc | f | f | f | f\n"
+ . " ABC | ABC | f | f | f | f\n"
+ . " ABC | äbc | t | t | t | t\n"
+ . " ABC | bbc | t | t | t | t\n"
+ . " äbc | abc | f | f | f | f\n"
+ . " äbc | ABC | f | f | f | f\n"
+ . " äbc | äbc | f | f | f | f\n"
+ . " äbc | bbc | t | t | t | t\n"
+ . " bbc | abc | f | f | f | f\n"
+ . " bbc | ABC | f | f | f | f\n"
+ . " bbc | äbc | f | f | f | f\n"
+ . " bbc | bbc | f | f | f | f\n"
+ . "(16 rows)\n"
+ . "\n"
+ ],
+ # polymorphism
+ [
+ "SELECT * FROM unnest("
+ . "(SELECT array_agg(b ORDER BY b) FROM collate_test1)"
+ . ") ORDER BY 1;",
+ " unnest \n"
+ . "--------\n"
+ . " abc\n"
+ . " ABC\n"
+ . " äbc\n"
+ . " bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "CREATE FUNCTION dup (anyelement) RETURNS anyelement "
+ . "AS 'select \$1' LANGUAGE sql;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "SELECT a, dup(b) FROM collate_test1 ORDER BY 2;",
+ " a | dup \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # indexes
+ [
+ "CREATE INDEX collate_test1_idx1 ON collate_test1 (b);",
+ "CREATE INDEX\n"
+ ]
+ )
+);
+
+$node->stop;
diff --git a/src/test/default_collation/libc/.gitignore b/src/test/default_collation/libc/.gitignore
new file mode 100644
index 0000000..871e943
--- /dev/null
+++ b/src/test/default_collation/libc/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/default_collation/libc/Makefile b/src/test/default_collation/libc/Makefile
new file mode 100644
index 0000000..98ab736
--- /dev/null
+++ b/src/test/default_collation/libc/Makefile
@@ -0,0 +1,11 @@
+# src/test/default_collation/libc/Makefile
+
+subdir = src/test/default_collation/libc
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+ $(prove_check)
+
+clean distclean maintainer-clean:
+ rm -rf tmp_check
diff --git a/src/test/default_collation/libc/t/001_default_collation.pl b/src/test/default_collation/libc/t/001_default_collation.pl
new file mode 100644
index 0000000..bc8a6ad
--- /dev/null
+++ b/src/test/default_collation/libc/t/001_default_collation.pl
@@ -0,0 +1,355 @@
+use strict;
+use warnings;
+
+use Config;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 90;
+
+my $tempdir = TestLib::tempdir;
+my $datadir = "$tempdir/data";
+
+# test initdb
+
+sub test_initdb
+{
+ my ($test_name, $options, $error_message) = @_;
+ my ($in_initdb, $out_initdb, $err_initdb);
+
+ mkdir $datadir;
+
+ my @command = (qw(initdb -A trust -N -D), $datadir, split(" ", $options));
+ print "# Running: " . join(" ", @command) . "\n";
+ my $result = IPC::Run::run \@command, \$in_initdb, \$out_initdb,
+ \$err_initdb;
+
+ if ($error_message)
+ {
+ like($err_initdb,
+ qr{$error_message},
+ "initdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_initdb, "", "\"@command\" no stderr");
+ like($out_initdb,
+ qr{The default collation provider is \"libc\"\.},
+ "initdb: $test_name: check output");
+ }
+
+ File::Path::rmtree $datadir;
+}
+
+# empty locales
+
+test_initdb(
+ "empty locales",
+ "",
+ "");
+
+# --locale
+
+test_initdb(
+ "empty libc locale",
+ "--locale=\@libc",
+ "");
+
+test_initdb(
+ "C locale without collation provider",
+ "--locale=C",
+ "");
+
+test_initdb(
+ "POSIX locale without collation provider",
+ "--locale=POSIX",
+ "");
+
+test_initdb(
+ "C libc locale",
+ "--locale=C\@libc",
+ "");
+
+test_initdb(
+ "C icu locale",
+ "--locale=C\@icu",
+ "ICU is not supported in this build");
+
+test_initdb(
+ "C locale too many modifiers",
+ "--locale=C\@icu\@libc",
+ "invalid locale name \"C\@icu\"");
+
+# --lc-collate
+
+test_initdb(
+ "empty libc lc_collate",
+ "--lc-collate=\@libc",
+ "");
+
+test_initdb(
+ "C lc_collate without collation provider",
+ "--lc-collate=C",
+ "");
+
+test_initdb(
+ "POSIX lc_collate without collation provider",
+ "--lc-collate=POSIX",
+ "");
+
+test_initdb(
+ "C libc lc_collate",
+ "--lc-collate=C\@libc",
+ "");
+
+test_initdb(
+ "C icu lc_collate",
+ "--lc-collate=C\@icu",
+ "ICU is not supported in this build");
+
+test_initdb(
+ "C lc_collate too many modifiers",
+ "--lc-collate=C\@icu\@libc",
+ "invalid locale name \"C\@icu\" \\(provider \"libc\"\\)");
+
+# --locale & --lc-collate
+
+test_initdb(
+ "lc_collate implicit provider takes precedence",
+ "--locale=\@icu --lc-collate=C",
+ "");
+
+test_initdb(
+ "lc_collate explicit provider takes precedence",
+ "--locale=\@icu --lc-collate=\@libc",
+ "");
+
+# test createdb and CREATE DATABASE
+
+sub test_createdb
+{
+ my ($test_name, $options, $from_template0, $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ if ($from_template0)
+ {
+ $options = $options . " --template=template0";
+ }
+
+ @command = ("createdb", split(" ", $options), "mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "createdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+
+ @command = (
+ "psql",
+ "-c",
+ "select datcollate from pg_database where datname = 'mydb';");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ like($out_command,
+ qr{\@libc\n},
+ "createdb: $test_name: check pg_database.datcollate");
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+sub test_create_database
+{
+ my ($test_name, $options, $from_template0, $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ @command = ("psql",
+ "-c",
+ "create database mydb "
+ . $options
+ . ($from_template0 ? " template = template0" : "")
+ . ";");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "CREATE DATABASE: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+ like($out_command, qr{CREATE DATABASE}, "\"@command\" check output");
+
+ @command = (
+ "psql",
+ "-c",
+ "select datcollate from pg_database where datname = 'mydb';");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ like($out_command,
+ qr{\@libc\n},
+ "CREATE DATABASE: $test_name: check pg_database.datcollate");
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+my $node = get_new_node('main');
+$node->init;
+$node->start;
+local $ENV{PGPORT} = $node->port;
+
+# test createdb
+
+# empty locales
+
+test_createdb(
+ "empty locales",
+ "",
+ 0,
+ "");
+
+# --locale
+
+test_createdb(
+ "empty libc locale",
+ "--locale=\@libc",
+ 0,
+ "");
+
+test_createdb(
+ "C locale without collation provider",
+ "--locale=C",
+ 1,
+ "");
+
+test_createdb(
+ "POSIX locale without collation provider",
+ "--locale=POSIX",
+ 1,
+ "");
+
+test_createdb(
+ "C libc locale",
+ "--locale=C\@libc",
+ 1,
+ "");
+
+test_createdb(
+ "C icu locale",
+ "--locale=C\@icu",
+ 1,
+ "ICU is not supported in this build");
+
+test_createdb(
+ "C locale too many modifiers",
+ "--locale=C\@icu\@libc",
+ 1,
+ "invalid locale name: \"C\@icu\"");
+
+# --lc-collate
+
+test_createdb(
+ "empty libc lc_collate",
+ "--lc-collate=\@libc",
+ 0,
+ "");
+
+test_createdb(
+ "C lc_collate without collation provider",
+ "--lc-collate=C",
+ 1,
+ "");
+test_createdb(
+ "POSIX lc_collate without collation provider",
+ "--lc-collate=POSIX",
+ 1,
+ "");
+
+test_createdb(
+ "C libc lc_collate",
+ "--lc-collate=C\@libc",
+ 1,
+ "");
+
+test_createdb(
+ "C icu lc_collate",
+ "--lc-collate=C\@icu",
+ 1,
+ "ICU is not supported in this build");
+
+test_createdb(
+ "C lc_collate too many modifiers",
+ "--lc-collate=C\@icu\@libc",
+ 1,
+ "invalid locale name: \"C\@icu\" \\(provider \"libc\"\\)");
+
+# test CREATE DATABASE
+
+# empty locales
+
+test_create_database(
+ "empty locales",
+ "",
+ 0,
+ "");
+
+# LC_COLLATE
+
+test_create_database(
+ "empty libc lc_collate",
+ "LC_COLLATE = '\@libc'",
+ 0,
+ "");
+
+test_create_database(
+ "C lc_collate without collation provider",
+ "LC_COLLATE = 'C'",
+ 1,
+ "");
+test_create_database(
+ "POSIX lc_collate without collation provider",
+ "LC_COLLATE = 'POSIX'",
+ 1,
+ "");
+
+test_create_database(
+ "C libc lc_collate",
+ "LC_COLLATE = 'C\@libc'",
+ 1,
+ "");
+
+test_create_database(
+ "C icu lc_collate",
+ "LC_COLLATE = 'C\@icu'",
+ 1,
+ "ICU is not supported in this build");
+
+test_create_database(
+ "C lc_collate too many modifiers",
+ "LC_COLLATE = 'C\@icu\@libc'",
+ 1,
+ "invalid locale name: \"C\@icu\" \\(provider \"libc\"\\)");
+
+$node->stop;
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index f485b5c..3fae21e 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -979,11 +979,14 @@ SELECT relname, pg_get_indexdef(oid) FROM pg_class WHERE relname LIKE 'collate_t
-- schema manipulation commands
CREATE ROLE regress_test_role;
CREATE SCHEMA test_schema;
+-- remove provider modifier and collation version
+CREATE FUNCTION get_lc_collate (text) RETURNS text LANGUAGE sql
+ AS $$ select substring($1 from '(.*)@[^@]+$') $$;
-- We need to do this this way to cope with varying names for encodings:
do $$
BEGIN
EXECUTE 'CREATE COLLATION test0 (provider = icu, locale = ' ||
- quote_literal(current_setting('lc_collate')) || ');';
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) || ');';
END
$$;
CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
@@ -991,7 +994,7 @@ ERROR: collation "test0" already exists
do $$
BEGIN
EXECUTE 'CREATE COLLATION test1 (provider = icu, lc_collate = ' ||
- quote_literal(current_setting('lc_collate')) ||
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) ||
', lc_ctype = ' ||
quote_literal(current_setting('lc_ctype')) || ');';
END
@@ -1102,7 +1105,7 @@ drop type textrange_c;
drop type textrange_en_us;
-- cleanup
DROP SCHEMA collate_tests CASCADE;
-NOTICE: drop cascades to 18 other objects
+NOTICE: drop cascades to 19 other objects
DETAIL: drop cascades to table collate_test1
drop cascades to table collate_test_like
drop cascades to table collate_test2
@@ -1121,6 +1124,7 @@ drop cascades to function mylt_noninline(text,text)
drop cascades to function mylt_plpgsql(text,text)
drop cascades to function mylt2(text,text)
drop cascades to function dup(anyelement)
+drop cascades to function get_lc_collate(text)
RESET search_path;
-- leave a collation for pg_upgrade test
CREATE COLLATION coll_icu_upgrade FROM "und-x-icu";
diff --git a/src/test/regress/expected/collate.linux.utf8.out b/src/test/regress/expected/collate.linux.utf8.out
index 400a747..7aa8057 100644
--- a/src/test/regress/expected/collate.linux.utf8.out
+++ b/src/test/regress/expected/collate.linux.utf8.out
@@ -988,11 +988,14 @@ SELECT relname, pg_get_indexdef(oid) FROM pg_class WHERE relname LIKE 'collate_t
-- schema manipulation commands
CREATE ROLE regress_test_role;
CREATE SCHEMA test_schema;
+-- remove provider modifier and collation version
+CREATE FUNCTION get_lc_collate (text) RETURNS text LANGUAGE sql
+ AS $$ select substring($1 from '(.*)@[^@]+$') $$;
-- We need to do this this way to cope with varying names for encodings:
do $$
BEGIN
EXECUTE 'CREATE COLLATION test0 (locale = ' ||
- quote_literal(current_setting('lc_collate')) || ');';
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) || ');';
END
$$;
CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
@@ -1004,7 +1007,7 @@ NOTICE: collation "test0" for encoding "UTF8" already exists, skipping
do $$
BEGIN
EXECUTE 'CREATE COLLATION test1 (lc_collate = ' ||
- quote_literal(current_setting('lc_collate')) ||
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) ||
', lc_ctype = ' ||
quote_literal(current_setting('lc_ctype')) || ');';
END
@@ -1119,7 +1122,7 @@ drop type textrange_c;
drop type textrange_en_us;
-- cleanup
DROP SCHEMA collate_tests CASCADE;
-NOTICE: drop cascades to 18 other objects
+NOTICE: drop cascades to 19 other objects
DETAIL: drop cascades to table collate_test1
drop cascades to table collate_test_like
drop cascades to table collate_test2
@@ -1138,3 +1141,4 @@ drop cascades to function mylt_noninline(text,text)
drop cascades to function mylt_plpgsql(text,text)
drop cascades to function mylt2(text,text)
drop cascades to function dup(anyelement)
+drop cascades to function get_lc_collate(text)
diff --git a/src/test/regress/sql/collate.icu.utf8.sql b/src/test/regress/sql/collate.icu.utf8.sql
index ef39445..936d684 100644
--- a/src/test/regress/sql/collate.icu.utf8.sql
+++ b/src/test/regress/sql/collate.icu.utf8.sql
@@ -339,18 +339,22 @@ SELECT relname, pg_get_indexdef(oid) FROM pg_class WHERE relname LIKE 'collate_t
CREATE ROLE regress_test_role;
CREATE SCHEMA test_schema;
+-- remove provider modifier and collation version
+CREATE FUNCTION get_lc_collate (text) RETURNS text LANGUAGE sql
+ AS $$ select substring($1 from '(.*)@[^@]+$') $$;
+
-- We need to do this this way to cope with varying names for encodings:
do $$
BEGIN
EXECUTE 'CREATE COLLATION test0 (provider = icu, locale = ' ||
- quote_literal(current_setting('lc_collate')) || ');';
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) || ');';
END
$$;
CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
do $$
BEGIN
EXECUTE 'CREATE COLLATION test1 (provider = icu, lc_collate = ' ||
- quote_literal(current_setting('lc_collate')) ||
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) ||
', lc_ctype = ' ||
quote_literal(current_setting('lc_ctype')) || ');';
END
diff --git a/src/test/regress/sql/collate.linux.utf8.sql b/src/test/regress/sql/collate.linux.utf8.sql
index b51162e..e03ea1b 100644
--- a/src/test/regress/sql/collate.linux.utf8.sql
+++ b/src/test/regress/sql/collate.linux.utf8.sql
@@ -339,11 +339,15 @@ SELECT relname, pg_get_indexdef(oid) FROM pg_class WHERE relname LIKE 'collate_t
CREATE ROLE regress_test_role;
CREATE SCHEMA test_schema;
+-- remove provider modifier and collation version
+CREATE FUNCTION get_lc_collate (text) RETURNS text LANGUAGE sql
+ AS $$ select substring($1 from '(.*)@[^@]+$') $$;
+
-- We need to do this this way to cope with varying names for encodings:
do $$
BEGIN
EXECUTE 'CREATE COLLATION test0 (locale = ' ||
- quote_literal(current_setting('lc_collate')) || ');';
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) || ');';
END
$$;
CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
@@ -352,7 +356,7 @@ CREATE COLLATION IF NOT EXISTS test0 (locale = 'foo'); -- ok, skipped
do $$
BEGIN
EXECUTE 'CREATE COLLATION test1 (lc_collate = ' ||
- quote_literal(current_setting('lc_collate')) ||
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) ||
', lc_ctype = ' ||
quote_literal(current_setting('lc_ctype')) || ');';
END
diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index 4543d87..969afd6 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -51,12 +51,19 @@ my @contrib_excludes = (
'snapshot_too_old');
# Set of variables for frontend modules
-my $frontend_defines = { 'initdb' => 'FRONTEND' };
+my $frontend_defines = {
+ 'initdb' => 'FRONTEND',
+ 'psql' => 'FRONTEND',
+ 'pg_dump' => 'FRONTEND',
+ 'pg_dumpall' => 'FRONTEND',
+ 'pg_restore' => 'FRONTEND',
+ };
my @frontend_uselibpq = ('pg_ctl', 'pg_upgrade', 'pgbench', 'psql', 'initdb');
my @frontend_uselibpgport = (
'pg_archivecleanup', 'pg_test_fsync',
'pg_test_timing', 'pg_upgrade',
'pg_waldump', 'pgbench');
+my @iculibs = ('icuin.lib', 'icuuc.lib');
my @frontend_uselibpgcommon = (
'pg_archivecleanup', 'pg_test_fsync',
'pg_test_timing', 'pg_upgrade',
@@ -65,8 +72,10 @@ my $frontend_extralibs = {
'initdb' => ['ws2_32.lib'],
'pg_restore' => ['ws2_32.lib'],
'pgbench' => ['ws2_32.lib'],
+ 'mchar' => [@iculibs],
'psql' => ['ws2_32.lib']
};
+my @frontend_iculibs = ('initdb', 'pg_upgrade');
my $frontend_extraincludes = {
'initdb' => ['src/timezone'],
'psql' => ['src/backend']
@@ -115,9 +124,9 @@ sub mkvcbuild
}
our @pgcommonallfiles = qw(
- base64.c config_info.c controldata_utils.c exec.c file_perm.c ip.c
- keywords.c md5.c pg_lzcompress.c pgfnames.c psprintf.c relpath.c rmtree.c
- saslprep.c scram-common.c string.c unicode_norm.c username.c
+ base64.c config_info.c controldata_utils.c exec.c file_perm.c ip.c keywords.c
+ md5.c pg_collation_fn_common.c pg_lzcompress.c pgfnames.c psprintf.c relpath.c
+ rmtree.c saslprep.c scram-common.c string.c unicode_norm.c username.c
wait_error.c);
if ($solution->{options}->{openssl})
@@ -150,6 +159,7 @@ sub mkvcbuild
$libpgfeutils->AddDefine('FRONTEND');
$libpgfeutils->AddIncludeDir('src/interfaces/libpq');
$libpgfeutils->AddFiles('src/fe_utils', @pgfeutilsfiles);
+ $libpgfeutils->AddFile('src/common/pg_collation_fn_common.c');
$postgres = $solution->AddProject('postgres', 'exe', '', 'src/backend');
$postgres->AddIncludeDir('src/backend');
@@ -234,6 +244,7 @@ sub mkvcbuild
'src/interfaces/libpq');
$libpq->AddDefine('FRONTEND');
$libpq->AddDefine('UNSAFE_STAT_OK');
+ $libpq->AddDefine('LIBPQ_MAKE');
$libpq->AddIncludeDir('src/port');
$libpq->AddLibrary('secur32.lib');
$libpq->AddLibrary('ws2_32.lib');
@@ -242,6 +253,7 @@ sub mkvcbuild
$libpq->ReplaceFile('src/interfaces/libpq/libpqrc.c',
'src/interfaces/libpq/libpq.rc');
$libpq->AddReference($libpgport);
+ $libpq->AddFile('src/common/pg_collation_fn_common.c');
# The OBJS scraper doesn't know about ifdefs, so remove fe-secure-openssl.c
# and sha2_openssl.c if building without OpenSSL, and remove sha2.c if
@@ -426,6 +438,12 @@ sub mkvcbuild
{
push @contrib_excludes, 'uuid-ossp';
}
+ else
+ {
+ foreach my $fe (@frontend_iculibs) {
+ push @{$frontend_extralibs->{$fe}}, @iculibs;
+ }
+ }
# AddProject() does not recognize the constructs used to populate OBJS in
# the pgcrypto Makefile, so it will discover no files.
On Sat, Jun 24, 2017 at 10:55 AM Peter Geoghegan <pg@bowt.ie> wrote:
On Fri, Jun 23, 2017 at 11:32 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
It's something I hope to address soon.
I hope you do. I think that we'd realize significant benefits by
having ICU become the de facto standard collation provider, that most
users get without even realizing it. As things stand, you have to make
a point of specifying an ICU collation as your per-column collation
within every CREATE TABLE. That's a significant barrier to adoption.
1) Associate by name only. That is, you can create a database with any
COLLATION "foo" that you want, and it's only checked when you first
connect to or do anything in the database.
2) Create shared collations. Then we'd need a way to manage having a
mix of shared and non-shared collations around.
There are significant pros and cons to all of these ideas. Some people
I talked to appeared to prefer the shared collations approach.
I strongly prefer the second approach. The only downside that occurs
to me is that that approach requires more code. Is there something
that I've missed?
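For reference, the per-column workaround described in the quoted text looks roughly like the sketch below. The table and column names are hypothetical; "en-US-x-icu" is the ICU collation name used at the start of this thread and is only available in a build with ICU support.

```sql
-- Hypothetical table: the ICU collation has to be spelled out per column,
-- because the database default can only be a libc locale.
CREATE TABLE customers (
    id   integer PRIMARY KEY,
    name text COLLATE "en-US-x-icu"
);

-- Sorts with ICU because the column carries the collation.
SELECT name FROM customers ORDER BY name;

-- Alternatively, the collation can be applied per expression.
SELECT name FROM customers ORDER BY name COLLATE "en-US-x-icu";
```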
Sorry to join this thread late. I was redirected here from another one[1].
I like the shared catalog idea, but here's one objection I thought
about: it makes it a bit harder to track whether you've sorted out all
your indexes after a version change. Say collation fr_CA's version
changes according to the provider, so that it no longer matches the
stored collversion. Now you'll need to be careful to connect to every
database in the cluster and run REINDEX, before you run ALTER
COLLATION "fr_CA" REFRESH VERSION to update the single shared
pg_collation row's collversion. With the non-shared pg_collation
scheme we have currently, you'd need to refresh the collation row in
each database after reindexing the whole database, which is IMHO a bit
nicer (you track which databases you've dealt with as you go through
them).
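As a rough sketch of the per-database sequence described above (with the current non-shared pg_collation), the steps in each database might look like this. The database name mydb is hypothetical and "fr_CA" is just the example collation from this message; the statements have to be repeated after connecting to every database in the cluster.

```sql
-- 1. List collations whose recorded version differs from what the
--    provider reports now.
SELECT collname,
       collversion,
       pg_collation_actual_version(oid) AS actual_version
FROM pg_collation
WHERE collversion IS DISTINCT FROM pg_collation_actual_version(oid);

-- 2. Rebuild indexes that may depend on the old ordering.
--    REINDEX DATABASE must name the database you are connected to.
REINDEX DATABASE mydb;

-- 3. Only after reindexing, record the new version in this database.
ALTER COLLATION "fr_CA" REFRESH VERSION;
```

With a shared catalog, step 3 would happen once for the whole cluster, so nothing would record which databases had already gone through step 2.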
In other words, using a shared catalog moves the "scope" of the
version tracking even further away from the ideal scope, and requires
humans to actually get the cleanup right, and it's extra confusing
because you can only be connected to one database at a time so there
is no "REINDEX MY CLUSTER" and no possibility of making a command that
reindexes dependent indexes and then refreshes the collation version.
The ideal scope would be to track all referenced collation versions on
every index, and only update them at CREATE INDEX or REINDEX time
(also, as discussed in some other thread, CHECK constraints and
partition keys might be invalidated and should in theory also carry
versions that can only be updated by running a hypothetical RECHECK or
REPARTITION command). Then a shared pg_collation catalog would make
perfect sense, and there would be no need for it to have a collversion
column at all, or an ALTER COLLATION ... REFRESH VERSION command, and
therefore there would be no way to screw it up by REFRESHing the
VERSION without having really fixed the problem.
[1]: /messages/by-id/242e081c-aec8-a20a-510c-f4d0f183cebd@2ndquadrant.com
--
Thomas Munro
http://www.enterprisedb.com
On Wed, Sep 12, 2018 at 01:06:12PM +1200, Thomas Munro wrote:
The ideal scope would be to track all referenced collation versions on
every index, and only update them at CREATE INDEX or REINDEX time
(also, as discussed in some other thread, CHECK constraints and
partition keys might be invalidated and should in theory also carry
versions that can only be updated by running a hypothetical RECHECK or
REPARTITION command). Then a shared pg_collation catalog would make
perfect sense, and there would be no need for it to have a collversion
column at all, or an ALTER COLLATION ... REFRESH VERSION command, and
therefore there would be no way to screw it up by REFRESHing the
VERSION without having really fixed the problem.
Please note that the latest patch set does not apply, so this has been
switched to commit fest 2018-11, waiting on author for a rebase.
--
Michael
Hi!
On 2 Oct 2018, at 11:37, Michael Paquier <michael@paquier.xyz> wrote:
Please note that the latest patch set does not apply, so this has been
switched to commit fest 2018-11, waiting on author for a rebase.
Please find attached a rebased version. I've added LDFLAGS_INTERNAL += $(ICU_LIBS) in libpq, but I'm not entirely sure this is the correct way to deal with the linker complaints about ICU functions in libpq.
Best regards, Andrey Borodin.
Attachments:
0001-ICU-as-a-default-collation-provider-rebased-oct-2018.patch (application/octet-stream)
From 4c246e3ca56540084395d3747acbf30ccb7d5db8 Mon Sep 17 00:00:00 2001
From: Andrey Borodin <amborodin@acm.org>
Date: Tue, 30 Oct 2018 10:50:38 +0500
Subject: [PATCH] ICU as a default collation provider rebased oct 2018
---
doc/src/sgml/charset.sgml | 55 ++
doc/src/sgml/ref/create_database.sgml | 8 +-
doc/src/sgml/ref/createdb.sgml | 18 +-
doc/src/sgml/ref/initdb.sgml | 9 +-
doc/src/sgml/regress.sgml | 17 +
src/backend/catalog/information_schema.sql | 2 +-
src/backend/commands/collationcmds.c | 33 +-
src/backend/commands/dbcommands.c | 152 +++-
src/backend/main/main.c | 5 +-
src/backend/regex/regc_pg_locale.c | 40 +-
src/backend/utils/adt/formatting.c | 111 ++-
src/backend/utils/adt/like.c | 16 +-
src/backend/utils/adt/pg_locale.c | 390 ++++++---
src/backend/utils/adt/selfuncs.c | 14 +-
src/backend/utils/adt/varlena.c | 270 +++---
src/backend/utils/init/postinit.c | 118 ++-
src/backend/utils/mb/encnames.c | 4 +-
src/bin/initdb/Makefile | 2 +-
src/bin/initdb/initdb.c | 386 ++++++++-
src/bin/pg_dump/pg_dump.c | 31 +-
src/bin/psql/describe.c | 10 +-
src/bin/scripts/Makefile | 2 +-
src/bin/scripts/createdb.c | 14 +-
src/common/Makefile | 2 +-
src/common/pg_collation_fn_common.c | 90 ++
src/fe_utils/.gitignore | 1 +
src/fe_utils/Makefile | 11 +-
src/include/commands/dbcommands.h | 3 +-
src/include/common/pg_collation_fn_common.h | 22 +
src/include/pg_config.h.win32 | 4 +
src/include/port.h | 34 +
src/include/port/win32.h | 2 +-
src/include/utils/pg_locale.h | 12 +-
src/interfaces/libpq/.gitignore | 1 +
src/interfaces/libpq/Makefile | 3 +-
src/port/chklocale.c | 598 +++++++++++++
src/test/Makefile | 2 +-
src/test/default_collation/Makefile | 28 +
.../default_collation/icu.utf8/.gitignore | 2 +
src/test/default_collation/icu.utf8/Makefile | 11 +
.../icu.utf8/t/001_default_collation.pl | 799 ++++++++++++++++++
src/test/default_collation/icu/.gitignore | 2 +
src/test/default_collation/icu/Makefile | 11 +
.../icu/t/001_default_collation.pl | 605 +++++++++++++
.../default_collation/libc.utf8/.gitignore | 2 +
src/test/default_collation/libc.utf8/Makefile | 11 +
.../libc.utf8/t/001_default_collation.pl | 703 +++++++++++++++
src/test/default_collation/libc/.gitignore | 2 +
src/test/default_collation/libc/Makefile | 11 +
.../libc/t/001_default_collation.pl | 355 ++++++++
.../regress/expected/collate.icu.utf8.out | 10 +-
.../regress/expected/collate.linux.utf8.out | 10 +-
src/test/regress/sql/collate.icu.utf8.sql | 8 +-
src/test/regress/sql/collate.linux.utf8.sql | 8 +-
src/tools/msvc/Mkvcbuild.pm | 22 +-
55 files changed, 4729 insertions(+), 363 deletions(-)
create mode 100644 src/common/pg_collation_fn_common.c
create mode 100644 src/include/common/pg_collation_fn_common.h
create mode 100644 src/test/default_collation/Makefile
create mode 100644 src/test/default_collation/icu.utf8/.gitignore
create mode 100644 src/test/default_collation/icu.utf8/Makefile
create mode 100644 src/test/default_collation/icu.utf8/t/001_default_collation.pl
create mode 100644 src/test/default_collation/icu/.gitignore
create mode 100644 src/test/default_collation/icu/Makefile
create mode 100644 src/test/default_collation/icu/t/001_default_collation.pl
create mode 100644 src/test/default_collation/libc.utf8/.gitignore
create mode 100644 src/test/default_collation/libc.utf8/Makefile
create mode 100644 src/test/default_collation/libc.utf8/t/001_default_collation.pl
create mode 100644 src/test/default_collation/libc/.gitignore
create mode 100644 src/test/default_collation/libc/Makefile
create mode 100644 src/test/default_collation/libc/t/001_default_collation.pl
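Note for readers: the charset.sgml hunk below documents that pg_database.datcollate stores the value as locale@provider[.version]. The following is only a rough sketch of how such a value could be split, and it is not taken from the patch. The names parse_datcollate and ParsedCollate are hypothetical, and the splitting rule (last '@', then the first '.' after it, accepting only "icu" or "libc" as the provider) is an assumption made here because libc locale names such as de_DE@euro or en_US.UTF-8 can themselves contain '@' and '.'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the pieces of a datcollate value. */
typedef struct
{
	char		locale[64];
	char		provider[8];	/* "icu", "libc", or "" if none was given */
	char		version[32];	/* "" if no version suffix was present */
} ParsedCollate;

/*
 * Split "locale[@provider[.version]]" (assumed rule: last '@', then the
 * first '.' after it).  Returns false on empty or overlong components.
 */
static bool
parse_datcollate(const char *value, ParsedCollate *out)
{
	const char *at;

	assert(value != NULL && out != NULL);
	memset(out, 0, sizeof(*out));

	at = strrchr(value, '@');
	if (at != NULL)
	{
		const char *suffix = at + 1;
		const char *dot = strchr(suffix, '.');
		size_t		provlen = dot ? (size_t) (dot - suffix) : strlen(suffix);

		if ((provlen == 3 && strncmp(suffix, "icu", 3) == 0) ||
			(provlen == 4 && strncmp(suffix, "libc", 4) == 0))
		{
			size_t		loclen = (size_t) (at - value);

			if (loclen == 0 || loclen >= sizeof(out->locale))
				return false;
			if (dot && strlen(dot + 1) >= sizeof(out->version))
				return false;

			memcpy(out->locale, value, loclen);
			memcpy(out->provider, suffix, provlen);
			if (dot)
				strcpy(out->version, dot + 1);
			return true;
		}
	}

	/* No recognized provider suffix: treat the whole string as the locale. */
	if (value[0] == '\0' || strlen(value) >= sizeof(out->locale))
		return false;
	strcpy(out->locale, value);
	return true;
}
```

Under these assumptions a value like de_DE@euro is kept whole as the locale name, because "euro" is not a recognized provider. The patch's own check_locale_collprovider() may resolve such cases differently.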
diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml
index a6143ef8a7..8a46d3d311 100644
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@@ -537,6 +537,61 @@ SELECT * FROM test1 ORDER BY a || b COLLATE "fr_FR";
a database.
</para>
+ <para>
+ You can specify the default collation provider with the <option>--locale</option>
+ and <option>--lc-collate</option> options of the <xref linkend="app-initdb"/> or
+ <xref linkend="app-createdb"/> commands, as follows:
+<programlisting>
+--locale=<replaceable>locale</replaceable>[@<replaceable>provider</replaceable>]
+--lc-collate=<replaceable>locale</replaceable>[@<replaceable>provider</replaceable>]
+</programlisting>
+ where <replaceable>provider</replaceable> can take the <literal>icu</literal>
+ or <literal>libc</literal> value, and <replaceable>locale</replaceable> is specified
+ in the <literal>libc</literal> format. You can only specify a single
+ locale provider after the <literal>@</literal> symbol.
+ The <literal>--lc-collate</literal> option overrides the
+ <literal>--locale</literal> setting, regardless of whether it specifies the
+ collation provider.
+ </para>
+
+ <para>
+ If you omit the collation provider, the <literal>libc</literal>
+ provider is used for the <literal>C</literal> and <literal>POSIX</literal>
+ locales. For other locales, the default providers are:
+ <itemizedlist spacing="compact">
+ <listitem>
+ <para>
+ <literal>icu</literal> at the cluster level
+ </para>
+ </listitem>
+ <listitem>
+ <para>the default collation provider of the template database at
+ the database level
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+
+ <important>
+ <para>
+ You can only use the <literal>icu</literal> collation provider for locales that are
+ supported by <literal>libc</literal> in your operating system and satisfy all
+ restrictions applicable to <literal>icu</literal>.
+ </para>
+ </important>
+
+ <para>
+ When you connect to a database,
+ <productname>PostgreSQL</productname> checks that the selected collation
+ provider and the version of the default collation are supported.
+ You can find the default database collation and the collation provider
+ in <structname>pg_database.datcollate</structname>. For ICU collations, the collation version is
+ also stored:
+ <programlisting>
+<replaceable>locale</replaceable>@<replaceable>provider</replaceable>[.<replaceable>version</replaceable>]
+</programlisting>
+ </para>
+
<sect3>
<title>Standard Collations</title>
diff --git a/doc/src/sgml/ref/create_database.sgml b/doc/src/sgml/ref/create_database.sgml
index b2c9e241c2..8b2e153651 100644
--- a/doc/src/sgml/ref/create_database.sgml
+++ b/doc/src/sgml/ref/create_database.sgml
@@ -25,7 +25,7 @@ CREATE DATABASE <replaceable class="parameter">name</replaceable>
[ [ WITH ] [ OWNER [=] <replaceable class="parameter">user_name</replaceable> ]
[ TEMPLATE [=] <replaceable class="parameter">template</replaceable> ]
[ ENCODING [=] <replaceable class="parameter">encoding</replaceable> ]
- [ LC_COLLATE [=] <replaceable class="parameter">lc_collate</replaceable> ]
+ [ LC_COLLATE [=] <replaceable class="parameter">lc_collate</replaceable>[@<replaceable class="parameter">provider</replaceable>] ]
[ LC_CTYPE [=] <replaceable class="parameter">lc_ctype</replaceable> ]
[ TABLESPACE [=] <replaceable class="parameter">tablespace_name</replaceable> ]
[ ALLOW_CONNECTIONS [=] <replaceable class="parameter">allowconn</replaceable> ]
@@ -112,13 +112,17 @@ CREATE DATABASE <replaceable class="parameter">name</replaceable>
</listitem>
</varlistentry>
<varlistentry>
- <term><replaceable class="parameter">lc_collate</replaceable></term>
+ <term><replaceable class="parameter">lc_collate</replaceable>[@<replaceable class="parameter">provider</replaceable>]</term>
<listitem>
<para>
Collation order (<literal>LC_COLLATE</literal>) to use in the new database.
This affects the sort order applied to strings, e.g. in queries with
ORDER BY, as well as the order used in indexes on text columns.
The default is to use the collation order of the template database.
+ Optionally, you can specify the default collation provider after the
+ <literal>@</literal> symbol, as explained in
+ <xref linkend="collation-managing"/>. Supported values are <literal>icu</literal>
+ and <literal>libc</literal>.
See below for additional restrictions.
</para>
</listitem>
diff --git a/doc/src/sgml/ref/createdb.sgml b/doc/src/sgml/ref/createdb.sgml
index 2658efeb1a..dbf87d31ec 100644
--- a/doc/src/sgml/ref/createdb.sgml
+++ b/doc/src/sgml/ref/createdb.sgml
@@ -121,22 +121,34 @@ PostgreSQL documentation
</varlistentry>
<varlistentry>
- <term><option>-l <replaceable class="parameter">locale</replaceable></option></term>
- <term><option>--locale=<replaceable class="parameter">locale</replaceable></option></term>
+ <term><option>-l <replaceable class="parameter">locale</replaceable>[@<replaceable>provider</replaceable>]</option></term>
+ <term><option>--locale=<replaceable class="parameter">locale</replaceable>[@<replaceable>provider</replaceable>]</option></term>
<listitem>
<para>
Specifies the locale to be used in this database. This is equivalent
to specifying both <option>--lc-collate</option> and <option>--lc-ctype</option>.
</para>
+
+ <para>
+ Optionally, you can specify the default collation provider after the
+ <literal>@</literal> symbol. Supported values are <literal>icu</literal>
+ and <literal>libc</literal>. For details, see <xref linkend="collation-managing"/>.
+ </para>
</listitem>
</varlistentry>
<varlistentry>
- <term><option>--lc-collate=<replaceable class="parameter">locale</replaceable></option></term>
+ <term><option>--lc-collate=<replaceable class="parameter">locale</replaceable>[@<replaceable>provider</replaceable>]</option></term>
<listitem>
<para>
Specifies the LC_COLLATE setting to be used in this database.
</para>
+
+ <para>
+ Optionally, you can specify the default collation provider after the
+ <literal>@</literal> symbol. Supported values are <literal>icu</literal>
+ and <literal>libc</literal>. For details, see <xref linkend="collation-managing"/>.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/initdb.sgml b/doc/src/sgml/ref/initdb.sgml
index 4489b585c7..738e41bab4 100644
--- a/doc/src/sgml/ref/initdb.sgml
+++ b/doc/src/sgml/ref/initdb.sgml
@@ -222,7 +222,7 @@ PostgreSQL documentation
</varlistentry>
<varlistentry>
- <term><option>--locale=<replaceable>locale</replaceable></option></term>
+ <term><option>--locale=<replaceable>locale</replaceable>[@<replaceable>provider</replaceable>]</option></term>
<listitem>
<para>
Sets the default locale for the database cluster. If this
@@ -230,11 +230,16 @@ PostgreSQL documentation
environment that <command>initdb</command> runs in. Locale
support is described in <xref linkend="locale"/>.
</para>
+ <para>
+ Optionally, you can specify the default collation provider after the
+ <literal>@</literal> symbol. Supported values are <literal>icu</literal>
+ and <literal>libc</literal>. For details, see <xref linkend="collation-managing"/>.
+ </para>
</listitem>
</varlistentry>
<varlistentry>
- <term><option>--lc-collate=<replaceable>locale</replaceable></option></term>
+ <term><option>--lc-collate=<replaceable>locale</replaceable>[@<replaceable>provider</replaceable>]</option></term>
<term><option>--lc-ctype=<replaceable>locale</replaceable></option></term>
<term><option>--lc-messages=<replaceable>locale</replaceable></option></term>
<term><option>--lc-monetary=<replaceable>locale</replaceable></option></term>
diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index 673a8c2164..dbc527964b 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -327,6 +327,23 @@ make check EXTRA_TESTS='collate.icu.utf8 collate.linux.utf8' LANG=en_US.utf8
</para>
</sect2>
+ <sect2>
+ <title>Extra TAP Tests for Default Collations</title>
+
+ <para>
+ To test the default collations on Linux/glibc platforms,
+ you can run extra TAP tests, as follows:
+<screen>
+make -C src/test/default_collation check-utf8
+</screen>
+ These tests only succeed when run in a database that uses the UTF-8
+ encoding. As these tests are TAP-based, you can only run them if
+ <productname>PostgreSQL</productname> was configured with the
+ <option>--enable-tap-tests</option> option.
+ For details, see <xref linkend="regress-tap"/>.
+ </para>
+ </sect2>
+
<sect2>
<title>Testing Hot Standby</title>
diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql
index f4e69f4a26..8d34006847 100644
--- a/src/backend/catalog/information_schema.sql
+++ b/src/backend/catalog/information_schema.sql
@@ -397,7 +397,7 @@ CREATE VIEW character_sets AS
CAST(c.collname AS sql_identifier) AS default_collate_name
FROM pg_database d
LEFT JOIN (pg_collation c JOIN pg_namespace nc ON (c.collnamespace = nc.oid))
- ON (datcollate = collcollate AND datctype = collctype)
+ ON (datcollate = (collcollate || '@libc') AND datctype = collctype)
WHERE d.datname = current_database()
ORDER BY char_length(c.collname) DESC, c.collname ASC -- prefer full/canonical name
LIMIT 1;
diff --git a/src/backend/commands/collationcmds.c b/src/backend/commands/collationcmds.c
index 8fb51e8c3d..6846ebc586 100644
--- a/src/backend/commands/collationcmds.c
+++ b/src/backend/commands/collationcmds.c
@@ -27,6 +27,7 @@
#include "commands/comment.h"
#include "commands/dbcommands.h"
#include "commands/defrem.h"
+#include "common/pg_collation_fn_common.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "utils/builtins.h"
@@ -162,11 +163,8 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
if (collproviderstr)
{
- if (pg_strcasecmp(collproviderstr, "icu") == 0)
- collprovider = COLLPROVIDER_ICU;
- else if (pg_strcasecmp(collproviderstr, "libc") == 0)
- collprovider = COLLPROVIDER_LIBC;
- else
+ collprovider = get_collprovider(collproviderstr);
+ if (!is_valid_nondefault_collprovider(collprovider))
ereport(ERROR,
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
errmsg("unrecognized collation provider: %s",
@@ -192,7 +190,8 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
else
{
collencoding = GetDatabaseEncoding();
- check_encoding_locale_matches(collencoding, collcollate, collctype);
+ check_encoding_locale_matches(collencoding, collcollate, collctype,
+ collprovider);
}
}
@@ -433,26 +432,6 @@ cmpaliases(const void *a, const void *b)
#ifdef USE_ICU
-/*
- * Get the ICU language tag for a locale name.
- * The result is a palloc'd string.
- */
-static char *
-get_icu_language_tag(const char *localename)
-{
- char buf[ULOC_FULLNAME_CAPACITY];
- UErrorCode status;
-
- status = U_ZERO_ERROR;
- uloc_toLanguageTag(localename, buf, sizeof(buf), TRUE, &status);
- if (U_FAILURE(status))
- ereport(ERROR,
- (errmsg("could not convert locale name \"%s\" to language tag: %s",
- localename, u_errorName(status))));
-
- return pstrdup(buf);
-}
-
/*
* Get a comment (specifically, the display name) for an ICU locale.
* The result is a palloc'd string, or NULL if we can't get a comment
@@ -698,7 +677,7 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
name = uloc_getAvailable(i);
langtag = get_icu_language_tag(name);
- collcollate = U_ICU_VERSION_MAJOR_NUM >= 54 ? langtag : name;
+ collcollate = get_icu_collate(name, langtag);
/*
* Be paranoid about not allowing any non-ASCII strings into
diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index 5342f217c0..610fece468 100644
--- a/src/backend/commands/dbcommands.c
+++ b/src/backend/commands/dbcommands.c
@@ -34,6 +34,7 @@
#include "catalog/indexing.h"
#include "catalog/objectaccess.h"
#include "catalog/pg_authid.h"
+#include "catalog/pg_collation.h"
#include "catalog/pg_database.h"
#include "catalog/pg_db_role_setting.h"
#include "catalog/pg_subscription.h"
@@ -44,6 +45,7 @@
#include "commands/defrem.h"
#include "commands/seclabel.h"
#include "commands/tablespace.h"
+#include "common/pg_collation_fn_common.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -141,6 +143,14 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
int notherbackends;
int npreparedxacts;
createdb_failure_params fparms;
+ char *src_canonname;
+ char src_collprovider;
+ char *dbcanonname = NULL;
+ char dbcollprovider;
+ char *dbcollate_full_name;
+ char *icu_wincollate = NULL;
+ char *langtag = NULL;
+ const char *collate;
/* Extract options from the statement node tree */
foreach(option, stmt->options)
@@ -350,8 +360,28 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
/* If encoding or locales are defaulted, use source's setting */
if (encoding < 0)
encoding = src_encoding;
+
+ check_locale_collprovider(src_collate, &src_canonname, &src_collprovider,
+ NULL);
+
+ if (!is_valid_nondefault_collprovider(src_collprovider))
+ /* This could happen when manually creating a mess in the catalogs. */
+ ereport(FATAL,
+ (errmsg("could not find out the collation provider for datcollate \"%s\" of template database \"%s\"",
+ src_collate, dbtemplate)));
+
if (dbcollate == NULL)
- dbcollate = src_collate;
+ {
+ dbcollate = src_canonname;
+ dbcollprovider = src_collprovider;
+ }
+ else
+ {
+ check_locale_collprovider(dbcollate, &dbcanonname, &dbcollprovider,
+ NULL);
+ dbcollate = dbcanonname;
+ }
+
if (dbctype == NULL)
dbctype = src_ctype;
@@ -362,18 +392,88 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
errmsg("invalid server encoding %d", encoding)));
/* Check that the chosen locales are valid, and get canonical spellings */
- if (!check_locale(LC_COLLATE, dbcollate, &canonname))
- ereport(ERROR,
- (errcode(ERRCODE_WRONG_OBJECT_TYPE),
- errmsg("invalid locale name: \"%s\"", dbcollate)));
- dbcollate = canonname;
- if (!check_locale(LC_CTYPE, dbctype, &canonname))
+
+ if (!check_locale(LC_CTYPE, dbctype, &canonname, '\0'))
ereport(ERROR,
(errcode(ERRCODE_WRONG_OBJECT_TYPE),
errmsg("invalid locale name: \"%s\"", dbctype)));
dbctype = canonname;
- check_encoding_locale_matches(encoding, dbcollate, dbctype);
+ /* we always check lc_collate for libc */
+ if (!check_locale(LC_COLLATE, dbcollate, &canonname, COLLPROVIDER_LIBC))
+ ereport(ERROR,
+ (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("invalid locale name: \"%s\" (provider \"%s\")",
+ dbcollate, get_collprovider_name(COLLPROVIDER_LIBC))));
+ dbcollate = canonname;
+
+ /* determine the collation provider if we haven't already done it */
+ if (!is_valid_nondefault_collprovider(dbcollprovider))
+ {
+ if (locale_is_c(dbcollate))
+ dbcollprovider = COLLPROVIDER_LIBC;
+ else
+ dbcollprovider = src_collprovider;
+ }
+
+ Assert(is_valid_nondefault_collprovider(dbcollprovider));
+
+#ifndef USE_ICU
+ if (dbcollprovider == COLLPROVIDER_ICU)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("ICU is not supported in this build"),
+ errhint("You need to rebuild PostgreSQL using --with-icu.")));
+#endif
+
+ /* check lc_collate and lc_ctype for icu if we need it */
+ if (dbcollprovider == COLLPROVIDER_ICU)
+ {
+ if (!check_locale(LC_COLLATE, dbcollate, NULL, dbcollprovider))
+ ereport(ERROR,
+ (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("invalid locale name: \"%s\" (provider \"%s\")",
+ dbcollate, get_collprovider_name(dbcollprovider))));
+
+ if (strcmp(dbcollate, dbctype) != 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("collations with different collate and ctype values are not supported by ICU")));
+ }
+
+ check_encoding_locale_matches(encoding, dbcollate, dbctype, dbcollprovider);
+
+ /* get the collation version */
+
+#ifdef USE_ICU
+ if (dbcollprovider == COLLPROVIDER_ICU)
+ {
+ collate = (const char *) dbcollate;
+#ifdef WIN32
+ if (!locale_is_c(collate))
+ {
+ icu_wincollate = check_icu_winlocale(collate);
+ collate = (const char *) icu_wincollate;
+ }
+#endif /* WIN32 */
+ langtag = get_icu_language_tag(collate);
+ collate = get_icu_collate(collate, langtag);
+ }
+ else
+#endif /* USE_ICU */
+ {
+ /* COLLPROVIDER_LIBC */
+ collate = (const char *) dbcollate;
+ }
+
+ dbcollate_full_name = get_full_collation_name(
+ dbcollate, dbcollprovider,
+ get_collation_actual_version(dbcollprovider, collate));
+
+ if (strlen(dbcollate_full_name) >= NAMEDATALEN)
+ ereport(ERROR,
+ (errmsg("the full database collation name \"%s\" is too long",
+ dbcollate_full_name)));
/*
* Check that the new encoding and locale settings match the source
@@ -395,11 +495,11 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
pg_encoding_to_char(src_encoding)),
errhint("Use the same encoding as in the template database, or use template0 as template.")));
- if (strcmp(dbcollate, src_collate) != 0)
+ if (strcmp(dbcollate_full_name, src_collate) != 0)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("new collation (%s) is incompatible with the collation of the template database (%s)",
- dbcollate, src_collate),
+ dbcollate_full_name, src_collate),
errhint("Use the same collation as in the template database, or use template0 as template.")));
if (strcmp(dbctype, src_ctype) != 0)
@@ -522,7 +622,7 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
new_record[Anum_pg_database_datdba - 1] = ObjectIdGetDatum(datdba);
new_record[Anum_pg_database_encoding - 1] = Int32GetDatum(encoding);
new_record[Anum_pg_database_datcollate - 1] =
- DirectFunctionCall1(namein, CStringGetDatum(dbcollate));
+ DirectFunctionCall1(namein, CStringGetDatum(dbcollate_full_name));
new_record[Anum_pg_database_datctype - 1] =
DirectFunctionCall1(namein, CStringGetDatum(dbctype));
new_record[Anum_pg_database_datistemplate - 1] = BoolGetDatum(dbistemplate);
@@ -690,6 +790,16 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
*/
ForceSyncCommit();
}
+
+ pfree(src_canonname);
+ pfree(dbcollate_full_name);
+ if (dbcanonname)
+ pfree(dbcanonname);
+ if (langtag)
+ pfree(langtag);
+ if (icu_wincollate)
+ pfree(icu_wincollate);
+
PG_END_ENSURE_ERROR_CLEANUP(createdb_failure_callback,
PointerGetDatum(&fparms));
@@ -719,7 +829,8 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
* Note: if you change this policy, fix initdb to match.
*/
void
-check_encoding_locale_matches(int encoding, const char *collate, const char *ctype)
+check_encoding_locale_matches(int encoding, const char *collate, const char *ctype,
+ char collprovider)
{
int ctype_encoding = pg_get_encoding_from_locale(ctype, true);
int collate_encoding = pg_get_encoding_from_locale(collate, true);
@@ -753,6 +864,23 @@ check_encoding_locale_matches(int encoding, const char *collate, const char *cty
collate),
errdetail("The chosen LC_COLLATE setting requires encoding \"%s\".",
pg_encoding_to_char(collate_encoding))));
+
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+#ifdef USE_ICU
+ if (!(is_encoding_supported_by_icu(encoding) ||
+ (encoding == PG_SQL_ASCII && superuser())))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("encoding \"%s\" is not supported for ICU locales",
+ pg_encoding_to_char(encoding))));
+#else
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("ICU is not supported in this build"),
+ errhint("You need to rebuild PostgreSQL using --with-icu.")));
+#endif
+ }
}
/* Error cleanup callback for createdb */
diff --git a/src/backend/main/main.c b/src/backend/main/main.c
index 38853e38eb..cb27d626bd 100644
--- a/src/backend/main/main.c
+++ b/src/backend/main/main.c
@@ -32,6 +32,7 @@
#endif
#include "bootstrap/bootstrap.h"
+#include "catalog/pg_collation.h"
#include "common/username.h"
#include "port/atomics.h"
#include "postmaster/postmaster.h"
@@ -306,8 +307,8 @@ startup_hacks(const char *progname)
static void
init_locale(const char *categoryname, int category, const char *locale)
{
- if (pg_perm_setlocale(category, locale) == NULL &&
- pg_perm_setlocale(category, "C") == NULL)
+ if (pg_perm_setlocale(category, locale, COLLPROVIDER_LIBC) == NULL &&
+ pg_perm_setlocale(category, "C", COLLPROVIDER_LIBC) == NULL)
elog(FATAL, "could not adopt \"%s\" locale nor C locale for %s",
locale, categoryname);
}
diff --git a/src/backend/regex/regc_pg_locale.c b/src/backend/regex/regc_pg_locale.c
index acbed2eeed..e836553f64 100644
--- a/src/backend/regex/regc_pg_locale.c
+++ b/src/backend/regex/regc_pg_locale.c
@@ -16,6 +16,7 @@
*/
#include "catalog/pg_collation.h"
+#include "common/pg_collation_fn_common.h"
#include "utils/pg_locale.h"
/*
@@ -240,8 +241,13 @@ pg_set_regex_collation(Oid collation)
}
else
{
+ char collprovider;
+
if (collation == DEFAULT_COLLATION_OID)
+ {
pg_regex_locale = 0;
+ collprovider = get_default_collprovider();
+ }
else if (OidIsValid(collation))
{
/*
@@ -250,6 +256,7 @@ pg_set_regex_collation(Oid collation)
* have to be considered below.
*/
pg_regex_locale = pg_newlocale_from_collation(collation);
+ collprovider = pg_regex_locale->provider;
}
else
{
@@ -263,24 +270,35 @@ pg_set_regex_collation(Oid collation)
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
+ Assert(is_valid_nondefault_collprovider(collprovider));
+
+ if (collprovider == COLLPROVIDER_ICU)
+ {
#ifdef USE_ICU
- if (pg_regex_locale && pg_regex_locale->provider == COLLPROVIDER_ICU)
pg_regex_strategy = PG_REGEX_LOCALE_ICU;
- else
+#else
+ /* shouldn't happen */
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
#endif
- if (GetDatabaseEncoding() == PG_UTF8)
- {
- if (pg_regex_locale)
- pg_regex_strategy = PG_REGEX_LOCALE_WIDE_L;
- else
- pg_regex_strategy = PG_REGEX_LOCALE_WIDE;
}
else
{
- if (pg_regex_locale)
- pg_regex_strategy = PG_REGEX_LOCALE_1BYTE_L;
+ /* COLLPROVIDER_LIBC */
+
+ if (GetDatabaseEncoding() == PG_UTF8)
+ {
+ if (pg_regex_locale)
+ pg_regex_strategy = PG_REGEX_LOCALE_WIDE_L;
+ else
+ pg_regex_strategy = PG_REGEX_LOCALE_WIDE;
+ }
else
- pg_regex_strategy = PG_REGEX_LOCALE_1BYTE;
+ {
+ if (pg_regex_locale)
+ pg_regex_strategy = PG_REGEX_LOCALE_1BYTE_L;
+ else
+ pg_regex_strategy = PG_REGEX_LOCALE_1BYTE;
+ }
}
pg_regex_collation = collation;
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index 4118b78ae4..4cc950f767 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -1479,7 +1479,7 @@ typedef int32_t (*ICU_Convert_Func) (UChar *dest, int32_t destCapacity,
UErrorCode *pErrorCode);
static int32_t
-icu_convert_case(ICU_Convert_Func func, pg_locale_t mylocale,
+icu_convert_case(ICU_Convert_Func func, const char *locale,
UChar **buff_dest, UChar *buff_source, int32_t len_source)
{
UErrorCode status;
@@ -1489,7 +1489,7 @@ icu_convert_case(ICU_Convert_Func func, pg_locale_t mylocale,
*buff_dest = palloc(len_dest * sizeof(**buff_dest));
status = U_ZERO_ERROR;
len_dest = func(*buff_dest, len_dest, buff_source, len_source,
- mylocale->info.icu.locale, &status);
+ locale, &status);
if (status == U_BUFFER_OVERFLOW_ERROR)
{
/* try again with adjusted length */
@@ -1497,7 +1497,7 @@ icu_convert_case(ICU_Convert_Func func, pg_locale_t mylocale,
*buff_dest = palloc(len_dest * sizeof(**buff_dest));
status = U_ZERO_ERROR;
len_dest = func(*buff_dest, len_dest, buff_source, len_source,
- mylocale->info.icu.locale, &status);
+ locale, &status);
}
if (U_FAILURE(status))
ereport(ERROR,
@@ -1555,8 +1555,15 @@ str_tolower(const char *buff, size_t nbytes, Oid collid)
else
{
pg_locale_t mylocale = 0;
+ char collprovider;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY,
+ use_icu;
- if (collid != DEFAULT_COLLATION_OID)
+ if (collid == DEFAULT_COLLATION_OID)
+ {
+ collprovider = get_default_collprovider();
+ }
+ else
{
if (!OidIsValid(collid))
{
@@ -1570,25 +1577,43 @@ str_tolower(const char *buff, size_t nbytes, Oid collid)
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
mylocale = pg_newlocale_from_collation(collid);
+ collprovider = mylocale->provider;
}
-#ifdef USE_ICU
- if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
+
+ if (use_icu)
{
+#ifdef USE_ICU
int32_t len_uchar;
int32_t len_conv;
UChar *buff_uchar;
UChar *buff_conv;
+ const char *locale;
+
+ if (mylocale)
+ locale = mylocale->info.icu.locale;
+ else
+ locale = get_icu_default_collate();
len_uchar = icu_to_uchar(&buff_uchar, buff, nbytes);
- len_conv = icu_convert_case(u_strToLower, mylocale,
+ len_conv = icu_convert_case(u_strToLower, locale,
&buff_conv, buff_uchar, len_uchar);
icu_from_uchar(&result, buff_conv, len_conv);
pfree(buff_uchar);
+#else /* not USE_ICU */
+ /* shouldn't happen */
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
+#endif /* not USE_ICU */
}
else
-#endif
{
+ /* use_libc */
+
if (pg_database_encoding_max_length() > 1)
{
wchar_t *workspace;
@@ -1677,8 +1702,15 @@ str_toupper(const char *buff, size_t nbytes, Oid collid)
else
{
pg_locale_t mylocale = 0;
+ char collprovider;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY,
+ use_icu;
- if (collid != DEFAULT_COLLATION_OID)
+ if (collid == DEFAULT_COLLATION_OID)
+ {
+ collprovider = get_default_collprovider();
+ }
+ else
{
if (!OidIsValid(collid))
{
@@ -1692,25 +1724,43 @@ str_toupper(const char *buff, size_t nbytes, Oid collid)
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
mylocale = pg_newlocale_from_collation(collid);
+ collprovider = mylocale->provider;
}
-#ifdef USE_ICU
- if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
+
+ if (use_icu)
{
+#ifdef USE_ICU
int32_t len_uchar,
len_conv;
UChar *buff_uchar;
UChar *buff_conv;
+ const char *locale;
+
+ if (mylocale)
+ locale = mylocale->info.icu.locale;
+ else
+ locale = get_icu_default_collate();
len_uchar = icu_to_uchar(&buff_uchar, buff, nbytes);
- len_conv = icu_convert_case(u_strToUpper, mylocale,
+ len_conv = icu_convert_case(u_strToUpper, locale,
&buff_conv, buff_uchar, len_uchar);
icu_from_uchar(&result, buff_conv, len_conv);
pfree(buff_uchar);
+#else /* not USE_ICU */
+ /* shouldn't happen */
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
+#endif /* not USE_ICU */
}
else
-#endif
{
+ /* use_libc */
+
if (pg_database_encoding_max_length() > 1)
{
wchar_t *workspace;
@@ -1800,8 +1850,15 @@ str_initcap(const char *buff, size_t nbytes, Oid collid)
else
{
pg_locale_t mylocale = 0;
+ char collprovider;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY,
+ use_icu;
- if (collid != DEFAULT_COLLATION_OID)
+ if (collid == DEFAULT_COLLATION_OID)
+ {
+ collprovider = get_default_collprovider();
+ }
+ else
{
if (!OidIsValid(collid))
{
@@ -1815,25 +1872,43 @@ str_initcap(const char *buff, size_t nbytes, Oid collid)
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
mylocale = pg_newlocale_from_collation(collid);
+ collprovider = mylocale->provider;
}
-#ifdef USE_ICU
- if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
+
+ if (use_icu)
{
+#ifdef USE_ICU
int32_t len_uchar,
len_conv;
UChar *buff_uchar;
UChar *buff_conv;
+ const char *locale;
+
+ if (mylocale)
+ locale = mylocale->info.icu.locale;
+ else
+ locale = get_icu_default_collate();
len_uchar = icu_to_uchar(&buff_uchar, buff, nbytes);
- len_conv = icu_convert_case(u_strToTitle_default_BI, mylocale,
+ len_conv = icu_convert_case(u_strToTitle_default_BI, locale,
&buff_conv, buff_uchar, len_uchar);
icu_from_uchar(&result, buff_conv, len_conv);
pfree(buff_uchar);
+#else /* not USE_ICU */
+ /* shouldn't happen */
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
+#endif /* not USE_ICU */
}
else
-#endif
{
+ /* use_libc */
+
if (pg_database_encoding_max_length() > 1)
{
wchar_t *workspace;
diff --git a/src/backend/utils/adt/like.c b/src/backend/utils/adt/like.c
index ff716c5f58..28ea64ffa0 100644
--- a/src/backend/utils/adt/like.c
+++ b/src/backend/utils/adt/like.c
@@ -167,6 +167,9 @@ Generic_Text_IC_like(text *str, text *pat, Oid collation)
plen;
pg_locale_t locale = 0;
bool locale_is_c = false;
+ char collprovider = COLLPROVIDER_LIBC;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY;
+ bool use_icu;
if (lc_ctype_is_c(collation))
locale_is_c = true;
@@ -184,7 +187,18 @@ Generic_Text_IC_like(text *str, text *pat, Oid collation)
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
locale = pg_newlocale_from_collation(collation);
+ collprovider = locale->provider;
}
+ else
+ {
+ collprovider = get_default_collprovider();
+ }
+
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
/*
* For efficiency reasons, in the single byte case we don't call lower()
@@ -194,7 +208,7 @@ Generic_Text_IC_like(text *str, text *pat, Oid collation)
* way.
*/
- if (pg_database_encoding_max_length() > 1 || (locale && locale->provider == COLLPROVIDER_ICU))
+ if (pg_database_encoding_max_length() > 1 || use_icu)
{
/* lower's result is never packed, so OK to use old macros here */
pat = DatumGetTextPP(DirectFunctionCall1Coll(lower, collation,
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index a3dc3be5a8..5d7c66bc9b 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -56,7 +56,10 @@
#include "access/htup_details.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_control.h"
+#include "catalog/pg_database.h"
+#include "common/pg_collation_fn_common.h"
#include "mb/pg_wchar.h"
+#include "miscadmin.h"
#include "utils/builtins.h"
#include "utils/hsearch.h"
#include "utils/lsyscache.h"
@@ -132,6 +135,10 @@ static HTAB *collation_cache = NULL;
static char *IsoLocaleName(const char *); /* MSVC specific */
#endif
+#ifdef USE_ICU
+static char *check_icu_locale(const char *locale);
+#endif
+
/*
* pg_perm_setlocale
@@ -146,13 +153,45 @@ static char *IsoLocaleName(const char *); /* MSVC specific */
* also be unset to fully ensure that, but that has to be done elsewhere after
* all the individual LC_XXX variables have been set correctly. (Thank you
* Perl for making this kluge necessary.)
+ *
+ * Set collprovider to '\0' if category is not LC_COLLATE.
*/
-char *
-pg_perm_setlocale(int category, const char *locale)
+const char *
+pg_perm_setlocale(int category, const char *locale, char collprovider)
{
- char *result;
+ const char *result;
const char *envvar;
char *envbuf;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY =
+ category != LC_COLLATE || collprovider == COLLPROVIDER_LIBC;
+ bool use_icu =
+ category == LC_COLLATE && collprovider == COLLPROVIDER_ICU;
+
+ Assert(use_libc || use_icu);
+
+ if (use_icu)
+ {
+#ifdef USE_ICU
+ UErrorCode status = U_ZERO_ERROR;
+ char *icu_locale = check_icu_locale(locale);
+
+ if (icu_locale == NULL && locale != NULL)
+ return NULL; /* fall out immediately on failure */
+
+ uloc_setDefault(icu_locale, &status);
+ if (U_FAILURE(status))
+ return NULL; /* fall out immediately on failure */
+
+ result = uloc_getDefault();
+ if (icu_locale)
+ pfree(icu_locale);
+ return result;
+#else /* not USE_ICU */
+ return NULL; /* fall out immediately on failure */
+#endif /* not USE_ICU */
+ }
+
+ /* use libc */
#ifndef WIN32
result = setlocale(category, locale);
@@ -167,7 +206,7 @@ pg_perm_setlocale(int category, const char *locale)
#ifdef LC_MESSAGES
if (category == LC_MESSAGES)
{
- result = (char *) locale;
+ result = locale;
if (locale == NULL || locale[0] == '\0')
return result;
}
@@ -218,7 +257,7 @@ pg_perm_setlocale(int category, const char *locale)
#ifdef WIN32
result = IsoLocaleName(locale);
if (result == NULL)
- result = (char *) locale;
+ result = locale;
#endif /* WIN32 */
break;
#endif /* LC_MESSAGES */
@@ -259,34 +298,102 @@ pg_perm_setlocale(int category, const char *locale)
* it seems that on most implementations that's the only thing it's good for;
* we could wish that setlocale gave back a canonically spelled version of
* the locale name, but typically it doesn't.)
+ *
+ * Set collprovider to '\0' if category is not LC_COLLATE.
*/
bool
-check_locale(int category, const char *locale, char **canonname)
+check_locale(int category, const char *locale, char **canonname,
+ char collprovider)
{
- char *save;
- char *res;
+ const char *save;
+ const char *res;
+ char *save_dup;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY =
+ category != LC_COLLATE || collprovider == COLLPROVIDER_LIBC;
+ bool use_icu =
+ category == LC_COLLATE && collprovider == COLLPROVIDER_ICU;
+#ifdef USE_ICU
+ UErrorCode status;
+ char *icu_locale;
+#endif
+
+ Assert(use_libc || use_icu);
if (canonname)
*canonname = NULL; /* in case of failure */
- save = setlocale(category, NULL);
- if (!save)
- return false; /* won't happen, we hope */
+#ifndef USE_ICU
+ /* cannot use icu functions */
+ if (use_icu)
+ return false;
+#endif
+
+#ifdef USE_ICU
+ if (use_icu)
+ {
+ save = uloc_getDefault();
+ if (!save)
+ return false; /* won't happen, we hope */
+ }
+ else
+#endif
+ {
+ /* use_libc */
+ save = setlocale(category, NULL);
+ if (!save)
+ return false; /* won't happen, we hope */
+ }
/* save may be pointing at a modifiable scratch variable, see above. */
- save = pstrdup(save);
+ save_dup = pstrdup(save);
/* set the locale with setlocale, to see if it accepts it. */
- res = setlocale(category, locale);
+#ifdef USE_ICU
+ if (use_icu)
+ {
+ icu_locale = check_icu_locale(locale);
+
+ if (icu_locale == NULL && locale != NULL)
+ {
+ pfree(save_dup);
+ return false; /* locale name could not be resolved */
+ }
+
+ status = U_ZERO_ERROR;
+ uloc_setDefault(icu_locale, &status);
+ if (U_FAILURE(status))
+ {
+ if (icu_locale)
+ pfree(icu_locale);
+ pfree(save_dup);
+ return false; /* ICU rejected the locale */
+ }
+
+ res = uloc_getDefault();
+ if (icu_locale)
+ pfree(icu_locale);
+ }
+ else
+#endif
+ {
+ /* use_libc */
+ res = setlocale(category, locale);
+ }
/* save canonical name if requested. */
if (res && canonname)
*canonname = pstrdup(res);
/* restore old value. */
- if (!setlocale(category, save))
- elog(WARNING, "failed to restore old locale \"%s\"", save);
- pfree(save);
+#ifdef USE_ICU
+ if (use_icu)
+ {
+ status = U_ZERO_ERROR;
+ uloc_setDefault(save_dup, &status);
+ if (U_FAILURE(status))
+ elog(WARNING, "ICU error: failed to restore old locale \"%s\"",
+ save_dup);
+ }
+ else
+#endif
+ {
+ /* use_libc */
+ if (!setlocale(category, save_dup))
+ elog(WARNING, "failed to restore old locale \"%s\"", save_dup);
+ }
+ pfree(save_dup);
return (res != NULL);
}
@@ -306,7 +413,7 @@ check_locale(int category, const char *locale, char **canonname)
bool
check_locale_monetary(char **newval, void **extra, GucSource source)
{
- return check_locale(LC_MONETARY, *newval, NULL);
+ return check_locale(LC_MONETARY, *newval, NULL, '\0');
}
void
@@ -318,7 +425,7 @@ assign_locale_monetary(const char *newval, void *extra)
bool
check_locale_numeric(char **newval, void **extra, GucSource source)
{
- return check_locale(LC_NUMERIC, *newval, NULL);
+ return check_locale(LC_NUMERIC, *newval, NULL, '\0');
}
void
@@ -330,7 +437,7 @@ assign_locale_numeric(const char *newval, void *extra)
bool
check_locale_time(char **newval, void **extra, GucSource source)
{
- return check_locale(LC_TIME, *newval, NULL);
+ return check_locale(LC_TIME, *newval, NULL, '\0');
}
void
@@ -366,7 +473,7 @@ check_locale_messages(char **newval, void **extra, GucSource source)
* On Windows, we can't even check the value, so accept blindly
*/
#if defined(LC_MESSAGES) && !defined(WIN32)
- return check_locale(LC_MESSAGES, *newval, NULL);
+ return check_locale(LC_MESSAGES, *newval, NULL, '\0');
#else
return true;
#endif
@@ -380,7 +487,7 @@ assign_locale_messages(const char *newval, void *extra)
* We ignore failure, as per comment above.
*/
#ifdef LC_MESSAGES
- (void) pg_perm_setlocale(LC_MESSAGES, newval);
+ (void) pg_perm_setlocale(LC_MESSAGES, newval, '\0');
#endif
}
@@ -1096,21 +1203,14 @@ lookup_collation_cache(Oid collation, bool set_flags)
/* Attempt to set the flags */
HeapTuple tp;
Form_pg_collation collform;
- const char *collcollate;
- const char *collctype;
tp = SearchSysCache1(COLLOID, ObjectIdGetDatum(collation));
if (!HeapTupleIsValid(tp))
elog(ERROR, "cache lookup failed for collation %u", collation);
collform = (Form_pg_collation) GETSTRUCT(tp);
- collcollate = NameStr(collform->collcollate);
- collctype = NameStr(collform->collctype);
-
- cache_entry->collate_is_c = ((strcmp(collcollate, "C") == 0) ||
- (strcmp(collcollate, "POSIX") == 0));
- cache_entry->ctype_is_c = ((strcmp(collctype, "C") == 0) ||
- (strcmp(collctype, "POSIX") == 0));
+ cache_entry->collate_is_c = locale_is_c(NameStr(collform->collcollate));
+ cache_entry->ctype_is_c = locale_is_c(NameStr(collform->collctype));
cache_entry->flags_valid = true;
@@ -1141,20 +1241,28 @@ lc_collate_is_c(Oid collation)
if (collation == DEFAULT_COLLATION_OID)
{
static int result = -1;
- char *localeptr;
+ char collprovider;
if (result >= 0)
return (bool) result;
- localeptr = setlocale(LC_COLLATE, NULL);
- if (!localeptr)
- elog(ERROR, "invalid LC_COLLATE setting");
-
- if (strcmp(localeptr, "C") == 0)
- result = true;
- else if (strcmp(localeptr, "POSIX") == 0)
- result = true;
- else
+
+ collprovider = get_default_collprovider();
+ Assert(is_valid_nondefault_collprovider(collprovider));
+
+ if (collprovider == COLLPROVIDER_ICU)
+ {
result = false;
+ }
+ else
+ {
+ /* COLLPROVIDER_LIBC */
+ char *localeptr = setlocale(LC_COLLATE, NULL);
+
+ if (!localeptr)
+ elog(ERROR, "invalid LC_COLLATE setting");
+
+ result = locale_is_c(localeptr);
+ }
return (bool) result;
}
@@ -1191,20 +1299,28 @@ lc_ctype_is_c(Oid collation)
if (collation == DEFAULT_COLLATION_OID)
{
static int result = -1;
- char *localeptr;
+ char collprovider;
if (result >= 0)
return (bool) result;
- localeptr = setlocale(LC_CTYPE, NULL);
- if (!localeptr)
- elog(ERROR, "invalid LC_CTYPE setting");
-
- if (strcmp(localeptr, "C") == 0)
- result = true;
- else if (strcmp(localeptr, "POSIX") == 0)
- result = true;
- else
+
+ collprovider = get_default_collprovider();
+ Assert(is_valid_nondefault_collprovider(collprovider));
+
+ if (collprovider == COLLPROVIDER_ICU)
+ {
result = false;
+ }
+ else
+ {
+ /* COLLPROVIDER_LIBC */
+ char *localeptr = setlocale(LC_CTYPE, NULL);
+
+ if (!localeptr)
+ elog(ERROR, "invalid LC_CTYPE setting");
+
+ result = locale_is_c(localeptr);
+ }
return (bool) result;
}
@@ -1365,25 +1481,15 @@ pg_newlocale_from_collation(Oid collid)
else if (collform->collprovider == COLLPROVIDER_ICU)
{
#ifdef USE_ICU
- UCollator *collator;
- UErrorCode status;
-
if (strcmp(collcollate, collctype) != 0)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("collations with different collate and ctype values are not supported by ICU")));
- status = U_ZERO_ERROR;
- collator = ucol_open(collcollate, &status);
- if (U_FAILURE(status))
- ereport(ERROR,
- (errmsg("could not open collator for locale \"%s\": %s",
- collcollate, u_errorName(status))));
-
/* We will leak this string if we get an error below :-( */
result.info.icu.locale = MemoryContextStrdup(TopMemoryContext,
collcollate);
- result.info.icu.ucol = collator;
+ result.info.icu.ucol = open_collator(collcollate);
#else /* not USE_ICU */
/* could get here if a collation was created by a build with ICU */
ereport(ERROR,
@@ -1440,46 +1546,6 @@ pg_newlocale_from_collation(Oid collid)
return cache_entry->locale;
}
-/*
- * Get provider-specific collation version string for the given collation from
- * the operating system/library.
- *
- * A particular provider must always either return a non-NULL string or return
- * NULL (if it doesn't support versions). It must not return NULL for some
- * collcollate and not NULL for others.
- */
-char *
-get_collation_actual_version(char collprovider, const char *collcollate)
-{
- char *collversion;
-
-#ifdef USE_ICU
- if (collprovider == COLLPROVIDER_ICU)
- {
- UCollator *collator;
- UErrorCode status;
- UVersionInfo versioninfo;
- char buf[U_MAX_VERSION_STRING_LENGTH];
-
- status = U_ZERO_ERROR;
- collator = ucol_open(collcollate, &status);
- if (U_FAILURE(status))
- ereport(ERROR,
- (errmsg("could not open collator for locale \"%s\": %s",
- collcollate, u_errorName(status))));
- ucol_getVersion(collator, versioninfo);
- ucol_close(collator);
-
- u_versionToString(versioninfo, buf);
- collversion = pstrdup(buf);
- }
- else
-#endif
- collversion = NULL;
-
- return collversion;
-}
-
#ifdef USE_ICU
/*
@@ -1761,3 +1827,125 @@ char2wchar(wchar_t *to, size_t tolen, const char *from, size_t fromlen,
return result;
}
+
+#ifdef USE_ICU
+/*
+ * If locale is "" or a Windows ".codepage" name, resolve it through libc first.
+ *
+ * Return a palloc'd copy of the resulting name, or NULL if there is none.
+ */
+static char *
+check_icu_locale(const char *locale)
+{
+ char *canonname = NULL;
+ char *winlocale = NULL;
+ char *result;
+
+ /* Windows locales can be in the format ".codepage" */
+ if (locale && (strlen(locale) == 0 || locale[0] == '.'))
+ {
+ check_locale(LC_COLLATE, locale, &canonname, COLLPROVIDER_LIBC);
+ locale = (const char *) canonname;
+ }
+
+#ifdef WIN32
+ if (!locale_is_c(locale))
+ {
+ winlocale = check_icu_winlocale(locale);
+ locale = (const char *) winlocale;
+ }
+#endif
+
+ result = locale ? pstrdup(locale) : NULL;
+
+ if (canonname)
+ pfree(canonname);
+ if (winlocale)
+ pfree(winlocale);
+
+ return result;
+}
+
+/*
+ * Get the default icu collation.
+ */
+const char *
+get_icu_default_collate(void)
+{
+ /* Cache the result so we only have to compute it once. */
+ static char result[NAMEDATALEN];
+ static bool cached = false;
+ const char *locale,
+ *collate;
+ char *langtag;
+
+ if (cached)
+ return result;
+
+ locale = uloc_getDefault();
+ if (!locale)
+ ereport(ERROR, (errmsg("ICU error: uloc_getDefault() failed")));
+
+ langtag = get_icu_language_tag(locale);
+ collate = get_icu_collate(locale, langtag);
+
+ if (strlen(collate) >= NAMEDATALEN)
+ ereport(FATAL,
+ (errmsg("the default ICU collation name \"%s\" is too long", collate)));
+
+ strcpy(result, collate);
+ cached = true;
+
+ pfree(langtag);
+ return result;
+}
+
+/*
+ * Get the collator for the default ICU collation.
+ */
+UCollator *
+get_default_collation_collator(void)
+{
+ /* Cache the result so we only have to compute it once. */
+ static UCollator *collator = NULL;
+
+ if (collator)
+ return collator;
+
+ collator = open_collator(get_icu_default_collate());
+ return collator;
+}
+#endif /* USE_ICU */
+
+/*
+ * Get the default collation provider.
+ */
+char
+get_default_collprovider(void)
+{
+ /* Cache the result so we only have to compute it once. */
+ static char result = '\0';
+ HeapTuple tp;
+ Form_pg_database dbform;
+ char *datcollate;
+
+ if (result)
+ return result;
+
+ tp = SearchSysCache1(DATABASEOID, ObjectIdGetDatum(MyDatabaseId));
+ if (!HeapTupleIsValid(tp))
+ elog(ERROR, "cache lookup failed for database %u", MyDatabaseId);
+
+ dbform = (Form_pg_database) GETSTRUCT(tp);
+ datcollate = NameStr(dbform->datcollate);
+ check_locale_collprovider(datcollate, NULL, &result, NULL);
+
+ if (!is_valid_nondefault_collprovider(result))
+ /* This could happen when manually creating a mess in the catalogs. */
+ ereport(FATAL,
+ (errmsg("could not find out the collation provider for datcollate \"%s\" of database \"%s\"",
+ datcollate, NameStr(dbform->datname))));
+
+ ReleaseSysCache(tp);
+ return result;
+}
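The pg_locale.c hunks above replace the repeated strcmp() tests against "C" and "POSIX" with a locale_is_c() helper, and keep the existing "static int result = -1" caching in lc_collate_is_c(). The helper's body is not shown in this part of the patch, so the following is only a guess at its behaviour, reconstructed from the removed lines. Both function names are made up for illustration.

```c
#include <assert.h>
#include <locale.h>
#include <stdbool.h>
#include <string.h>

/*
 * Stand-in for the patch's locale_is_c(); assumed to match the strcmp()
 * tests that the hunks remove.  Treating NULL as "not C" is an assumption.
 */
bool
sketch_locale_is_c(const char *locale)
{
	if (locale == NULL)
		return false;
	return strcmp(locale, "C") == 0 || strcmp(locale, "POSIX") == 0;
}

/*
 * Compute-once caching in the style of lc_collate_is_c(): -1 means
 * "not computed yet", afterwards the value is 0 or 1.
 */
bool
sketch_default_collate_is_c(void)
{
	static int	cached = -1;

	if (cached < 0)
	{
		const char *cur = setlocale(LC_COLLATE, NULL);

		/* The backend raises an error here; the sketch just asserts. */
		assert(cur != NULL);
		cached = sketch_locale_is_c(cur) ? 1 : 0;
	}
	return cached != 0;
}
```

In the patch itself the libc lookup is skipped when the database's default provider is ICU, in which case the cached answer is simply false.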
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index e0ece74bb9..ff7279a836 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -5719,13 +5719,14 @@ find_join_input_rel(PlannerInfo *root, Relids relids)
*/
static int
pattern_char_isalpha(char c, bool is_multibyte,
- pg_locale_t locale, bool locale_is_c)
+ pg_locale_t locale, char collprovider, bool locale_is_c)
{
if (locale_is_c)
return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
else if (is_multibyte && IS_HIGHBIT_SET(c))
return true;
- else if (locale && locale->provider == COLLPROVIDER_ICU)
+ else if (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII)
return IS_HIGHBIT_SET(c) ? true : false;
#ifdef HAVE_LOCALE_T
else if (locale && locale->provider == COLLPROVIDER_LIBC)
@@ -5761,6 +5762,7 @@ like_fixed_prefix(Const *patt_const, bool case_insensitive, Oid collation,
bool is_multibyte = (pg_database_encoding_max_length() > 1);
pg_locale_t locale = 0;
bool locale_is_c = false;
+ char collprovider = COLLPROVIDER_LIBC;
/* the right-hand const is type text or bytea */
Assert(typeid == BYTEAOID || typeid == TEXTOID);
@@ -5789,6 +5791,11 @@ like_fixed_prefix(Const *patt_const, bool case_insensitive, Oid collation,
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
locale = pg_newlocale_from_collation(collation);
+ collprovider = locale->provider;
+ }
+ else
+ {
+ collprovider = get_default_collprovider();
}
}
@@ -5826,7 +5833,8 @@ like_fixed_prefix(Const *patt_const, bool case_insensitive, Oid collation,
/* Stop if case-varying character (it's sort of a wildcard) */
if (case_insensitive &&
- pattern_char_isalpha(patt[pos], is_multibyte, locale, locale_is_c))
+ pattern_char_isalpha(patt[pos], is_multibyte, locale,
+ collprovider, locale_is_c))
break;
match[match_pos++] = patt[pos];
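The branch order in pattern_char_isalpha() after this change can be hard to follow from the diff alone. Here is a standalone sketch of the decision as the hunk writes it. The function and macro names are hypothetical, the encoding test is reduced to a boolean, and the libc locale_t branch is collapsed into plain isalpha().

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>

#define SKETCH_PROVIDER_LIBC 'c'	/* same letters as COLLPROVIDER_LIBC / _ICU */
#define SKETCH_PROVIDER_ICU  'i'

/*
 * Hypothetical restatement of pattern_char_isalpha(): is "c" a character
 * whose case might vary, so that a case-insensitive prefix must stop here?
 */
bool
sketch_char_is_case_varying(char c, bool is_multibyte, char provider,
							bool encoding_is_sql_ascii, bool locale_is_c)
{
	bool		highbit = (((unsigned char) c) & 0x80) != 0;

	assert(provider == SKETCH_PROVIDER_LIBC ||
		   provider == SKETCH_PROVIDER_ICU);

	if (locale_is_c)
		return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
	if (is_multibyte && highbit)
		return true;
	/* ICU is only consulted when the encoding is not SQL_ASCII. */
	if (provider == SKETCH_PROVIDER_ICU && !encoding_is_sql_ascii)
		return highbit;
	return isalpha((unsigned char) c) != 0;
}
```

Note that, read literally, the ICU branch answers "no" for a plain ASCII letter, since only the high bit is tested there. The sketch reproduces that behaviour rather than judging it.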
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index 0fd3b15748..a9207d4cdc 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -1401,8 +1401,15 @@ varstr_cmp(const char *arg1, int len1, const char *arg2, int len2, Oid collid)
char *a1p,
*a2p;
pg_locale_t mylocale = 0;
+ char collprovider;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY,
+ use_icu;
- if (collid != DEFAULT_COLLATION_OID)
+ if (collid == DEFAULT_COLLATION_OID)
+ {
+ collprovider = get_default_collprovider();
+ }
+ else
{
if (!OidIsValid(collid))
{
@@ -1416,8 +1423,15 @@ varstr_cmp(const char *arg1, int len1, const char *arg2, int len2, Oid collid)
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
mylocale = pg_newlocale_from_collation(collid);
+ collprovider = mylocale->provider;
}
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
+
/*
* memcmp() can't tell us which of two unequal strings sorts first,
* but it's a cheap way to tell if they're equal. Testing shows that
@@ -1432,8 +1446,7 @@ varstr_cmp(const char *arg1, int len1, const char *arg2, int len2, Oid collid)
#ifdef WIN32
/* Win32 does not have UTF-8, so we need to map to UTF-16 */
- if (GetDatabaseEncoding() == PG_UTF8
- && (!mylocale || mylocale->provider == COLLPROVIDER_LIBC))
+ if (GetDatabaseEncoding() == PG_UTF8 && use_libc)
{
int a1len;
int a2len;
@@ -1535,60 +1548,67 @@ varstr_cmp(const char *arg1, int len1, const char *arg2, int len2, Oid collid)
memcpy(a2p, arg2, len2);
a2p[len2] = '\0';
- if (mylocale)
+ if (use_icu)
{
- if (mylocale->provider == COLLPROVIDER_ICU)
- {
#ifdef USE_ICU
+ UCollator *collator;
+
+ if (mylocale)
+ collator = mylocale->info.icu.ucol;
+ else
+ collator = get_default_collation_collator();
+
#ifdef HAVE_UCOL_STRCOLLUTF8
- if (GetDatabaseEncoding() == PG_UTF8)
- {
- UErrorCode status;
+ if (GetDatabaseEncoding() == PG_UTF8)
+ {
+ UErrorCode status;
- status = U_ZERO_ERROR;
- result = ucol_strcollUTF8(mylocale->info.icu.ucol,
- arg1, len1,
- arg2, len2,
- &status);
- if (U_FAILURE(status))
- ereport(ERROR,
- (errmsg("collation failed: %s", u_errorName(status))));
- }
- else
+ status = U_ZERO_ERROR;
+ result = ucol_strcollUTF8(collator,
+ arg1, len1,
+ arg2, len2,
+ &status);
+ if (U_FAILURE(status))
+ ereport(ERROR,
+ (errmsg("collation failed: %s", u_errorName(status))));
+ }
+ else
#endif
- {
- int32_t ulen1,
- ulen2;
- UChar *uchar1,
- *uchar2;
+ {
+ int32_t ulen1,
+ ulen2;
+ UChar *uchar1,
+ *uchar2;
- ulen1 = icu_to_uchar(&uchar1, arg1, len1);
- ulen2 = icu_to_uchar(&uchar2, arg2, len2);
+ ulen1 = icu_to_uchar(&uchar1, arg1, len1);
+ ulen2 = icu_to_uchar(&uchar2, arg2, len2);
- result = ucol_strcoll(mylocale->info.icu.ucol,
- uchar1, ulen1,
- uchar2, ulen2);
+ result = ucol_strcoll(collator,
+ uchar1, ulen1,
+ uchar2, ulen2);
- pfree(uchar1);
- pfree(uchar2);
- }
+ pfree(uchar1);
+ pfree(uchar2);
+ }
#else /* not USE_ICU */
- /* shouldn't happen */
- elog(ERROR, "unsupported collprovider: %c", mylocale->provider);
+ /* shouldn't happen */
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
#endif /* not USE_ICU */
- }
- else
- {
+ }
+ else
+ {
+ /* use_libc */
+
+ if (mylocale)
#ifdef HAVE_LOCALE_T
result = strcoll_l(a1p, a2p, mylocale->info.lt);
#else
/* shouldn't happen */
- elog(ERROR, "unsupported collprovider: %c", mylocale->provider);
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
#endif
- }
+ else
+ result = strcoll(a1p, a2p);
}
- else
- result = strcoll(a1p, a2p);
/*
* In some locales strcoll() can claim that nonidentical strings are
@@ -1838,6 +1858,9 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
bool collate_c = false;
VarStringSortSupport *sss;
pg_locale_t locale = 0;
+ char collprovider = '\0';
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY = false;
+ bool use_icu = false;
/*
* If possible, set ssup->comparator to a function which can be used to
@@ -1867,7 +1890,11 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
* we'll figure out the collation based on the locale id and cache the
* result.
*/
- if (collid != DEFAULT_COLLATION_OID)
+ if (collid == DEFAULT_COLLATION_OID)
+ {
+ collprovider = get_default_collprovider();
+ }
+ else
{
if (!OidIsValid(collid))
{
@@ -1881,8 +1908,15 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
locale = pg_newlocale_from_collation(collid);
+ collprovider = locale->provider;
}
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
+
/*
* There is a further exception on Windows. When the database
* encoding is UTF-8 and we are not using the C collation, complex
@@ -1892,8 +1926,7 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
* trampoline. ICU locales work just the same on Windows, however.
*/
#ifdef WIN32
- if (GetDatabaseEncoding() == PG_UTF8 &&
- !(locale && locale->provider == COLLPROVIDER_ICU))
+ if (GetDatabaseEncoding() == PG_UTF8 && use_libc)
return;
#endif
@@ -1922,7 +1955,7 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
* platforms.
*/
#ifndef TRUST_STRXFRM
- if (!collate_c && !(locale && locale->provider == COLLPROVIDER_ICU))
+ if (!collate_c && !use_icu)
abbreviate = false;
#endif
@@ -2064,6 +2097,9 @@ varstrfastcmp_locale(Datum x, Datum y, SortSupport ssup)
VarString *arg2 = DatumGetVarStringPP(y);
bool arg1_match;
VarStringSortSupport *sss = (VarStringSortSupport *) ssup->ssup_extra;
+ char collprovider;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY,
+ use_icu;
/* working state */
char *a1p,
@@ -2157,59 +2193,77 @@ varstrfastcmp_locale(Datum x, Datum y, SortSupport ssup)
}
if (sss->locale)
+ collprovider = sss->locale->provider;
+ else
+ collprovider = get_default_collprovider();
+
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
+
+ if (use_icu)
{
- if (sss->locale->provider == COLLPROVIDER_ICU)
- {
#ifdef USE_ICU
-#ifdef HAVE_UCOL_STRCOLLUTF8
- if (GetDatabaseEncoding() == PG_UTF8)
- {
- UErrorCode status;
+ UCollator *collator;
- status = U_ZERO_ERROR;
- result = ucol_strcollUTF8(sss->locale->info.icu.ucol,
- a1p, len1,
- a2p, len2,
- &status);
- if (U_FAILURE(status))
- ereport(ERROR,
- (errmsg("collation failed: %s", u_errorName(status))));
- }
- else
+ if (sss->locale)
+ collator = sss->locale->info.icu.ucol;
+ else
+ collator = get_default_collation_collator();
+
+#ifdef HAVE_UCOL_STRCOLLUTF8
+ if (GetDatabaseEncoding() == PG_UTF8)
+ {
+ UErrorCode status;
+
+ status = U_ZERO_ERROR;
+ result = ucol_strcollUTF8(collator,
+ a1p, len1,
+ a2p, len2,
+ &status);
+ if (U_FAILURE(status))
+ ereport(ERROR,
+ (errmsg("collation failed: %s", u_errorName(status))));
+ }
+ else
#endif
- {
- int32_t ulen1,
- ulen2;
- UChar *uchar1,
- *uchar2;
+ {
+ int32_t ulen1,
+ ulen2;
+ UChar *uchar1,
+ *uchar2;
- ulen1 = icu_to_uchar(&uchar1, a1p, len1);
- ulen2 = icu_to_uchar(&uchar2, a2p, len2);
+ ulen1 = icu_to_uchar(&uchar1, a1p, len1);
+ ulen2 = icu_to_uchar(&uchar2, a2p, len2);
- result = ucol_strcoll(sss->locale->info.icu.ucol,
- uchar1, ulen1,
- uchar2, ulen2);
+ result = ucol_strcoll(collator,
+ uchar1, ulen1,
+ uchar2, ulen2);
- pfree(uchar1);
- pfree(uchar2);
- }
+ pfree(uchar1);
+ pfree(uchar2);
+ }
#else /* not USE_ICU */
- /* shouldn't happen */
- elog(ERROR, "unsupported collprovider: %c", sss->locale->provider);
+ /* shouldn't happen */
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
#endif /* not USE_ICU */
- }
- else
- {
+ }
+ else
+ {
+ /* use_libc */
+
+ if (sss->locale)
#ifdef HAVE_LOCALE_T
result = strcoll_l(sss->buf1, sss->buf2, sss->locale->info.lt);
#else
/* shouldn't happen */
- elog(ERROR, "unsupported collprovider: %c", sss->locale->provider);
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
#endif
- }
+ else
+ result = strcoll(sss->buf1, sss->buf2);
}
- else
- result = strcoll(sss->buf1, sss->buf2);
/*
* In some locales strcoll() can claim that nonidentical strings are
@@ -2314,6 +2368,9 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
else
{
Size bsize;
+ char collprovider;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY,
+ use_icu;
#ifdef USE_ICU
int32_t ulen = -1;
UChar *uchar = NULL;
@@ -2350,10 +2407,20 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
sss->buf1[len] = '\0';
sss->last_len1 = len;
+ if (sss->locale)
+ collprovider = sss->locale->provider;
+ else
+ collprovider = get_default_collprovider();
+
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
+
#ifdef USE_ICU
/* When using ICU and not UTF8, convert string to UChar. */
- if (sss->locale && sss->locale->provider == COLLPROVIDER_ICU &&
- GetDatabaseEncoding() != PG_UTF8)
+ if (use_icu && GetDatabaseEncoding() != PG_UTF8)
ulen = icu_to_uchar(&uchar, sss->buf1, len);
#endif
@@ -2367,9 +2434,15 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
*/
for (;;)
{
-#ifdef USE_ICU
- if (sss->locale && sss->locale->provider == COLLPROVIDER_ICU)
+ if (use_icu)
{
+#ifdef USE_ICU
+ UCollator *collator;
+
+ if (sss->locale)
+ collator = sss->locale->info.icu.ucol;
+ else
+ collator = get_default_collation_collator();
/*
* When using UTF8, use the iteration interface so we only
* need to produce as many bytes as we actually need.
@@ -2383,7 +2456,7 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
uiter_setUTF8(&iter, sss->buf1, len);
state[0] = state[1] = 0; /* won't need that again */
status = U_ZERO_ERROR;
- bsize = ucol_nextSortKeyPart(sss->locale->info.icu.ucol,
+ bsize = ucol_nextSortKeyPart(collator,
&iter,
state,
(uint8_t *) sss->buf2,
@@ -2395,19 +2468,26 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
u_errorName(status))));
}
else
- bsize = ucol_getSortKey(sss->locale->info.icu.ucol,
+ bsize = ucol_getSortKey(collator,
uchar, ulen,
(uint8_t *) sss->buf2, sss->buflen2);
+#else /* not USE_ICU */
+ /* shouldn't happen */
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
+#endif /* not USE_ICU */
}
else
-#endif
+ {
+ /* use_libc */
+
#ifdef HAVE_LOCALE_T
- if (sss->locale && sss->locale->provider == COLLPROVIDER_LIBC)
- bsize = strxfrm_l(sss->buf2, sss->buf1,
- sss->buflen2, sss->locale->info.lt);
- else
+ if (sss->locale)
+ bsize = strxfrm_l(sss->buf2, sss->buf1,
+ sss->buflen2, sss->locale->info.lt);
+ else
#endif
- bsize = strxfrm(sss->buf2, sss->buf1, sss->buflen2);
+ bsize = strxfrm(sss->buf2, sss->buf1, sss->buflen2);
+ }
sss->last_len2 = bsize;
if (bsize < sss->buflen2)
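Every varlena.c hunk above computes the same pair of flags before choosing between the ICU and libc comparison paths. A minimal sketch of that rule, assuming the provider letters 'i' and 'c' used by pg_collation.h and a made-up three-value encoding enum in place of the real pg_enc:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the database encoding. */
enum sketch_encoding
{
	SKETCH_SQL_ASCII,
	SKETCH_UTF8,
	SKETCH_LATIN1
};

/*
 * Restates the use_icu / use_libc computation from varstr_cmp() and
 * friends: ICU is used only when the effective provider is ICU and the
 * database encoding is not SQL_ASCII; otherwise libc handles it.
 */
bool
sketch_use_icu(char provider, enum sketch_encoding enc)
{
	bool		use_icu = (provider == 'i' && enc != SKETCH_SQL_ASCII);
	bool		use_libc = (provider == 'c' || enc == SKETCH_SQL_ASCII);

	/* Mirrors Assert(use_libc || use_icu) in the patch. */
	assert(use_libc || use_icu);
	return use_icu;
}
```

Since the expression is repeated verbatim in four functions, a shared helper along these lines might be worth considering in a later version of the patch.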
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 4f1d2a0d28..9d9bf38f9a 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -29,9 +29,11 @@
#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/pg_authid.h"
+#include "catalog/pg_collation.h"
#include "catalog/pg_database.h"
#include "catalog/pg_db_role_setting.h"
#include "catalog/pg_tablespace.h"
+#include "common/pg_collation_fn_common.h"
#include "libpq/auth.h"
#include "libpq/libpq-be.h"
#include "mb/pg_wchar.h"
@@ -318,6 +320,13 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect
Form_pg_database dbform;
char *collate;
char *ctype;
+ char *datcollate;
+ char collprovider;
+ char *collversion;
+ char *wincollate = NULL;
+ char *langtag = NULL;
+ const char *collcollate;
+ char *actual_versionstr;
/* Fetch our pg_database row normally, via syscache */
tup = SearchSysCache1(DATABASEOID, ObjectIdGetDatum(MyDatabaseId));
@@ -399,27 +408,124 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect
PGC_BACKEND, PGC_S_DYNAMIC_DEFAULT);
/* assign locale variables */
- collate = NameStr(dbform->datcollate);
ctype = NameStr(dbform->datctype);
+ datcollate = NameStr(dbform->datcollate);
+ check_locale_collprovider(datcollate, &collate, &collprovider,
+ &collversion);
- if (pg_perm_setlocale(LC_COLLATE, collate) == NULL)
+ if (!is_valid_nondefault_collprovider(collprovider))
+ /* This could happen when manually creating a mess in the catalogs. */
+ ereport(FATAL,
+ (errmsg("could not find out the collation provider for datcollate \"%s\" of database \"%s\"",
+ datcollate, name)));
+
+#ifndef USE_ICU
+ if (collprovider == COLLPROVIDER_ICU)
+ ereport(FATAL,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("ICU is not supported in this build"),
+ errhint("Recreate the database with libc locale or rebuild PostgreSQL using --with-icu.")));
+#endif
+
+ /* we always check lc_collate for libc */
+ if (pg_perm_setlocale(LC_COLLATE, collate, COLLPROVIDER_LIBC) == NULL)
ereport(FATAL,
(errmsg("database locale is incompatible with operating system"),
- errdetail("The database was initialized with LC_COLLATE \"%s\", "
- " which is not recognized by setlocale().", collate),
+ errdetail("The database was initialized with LC_COLLATE \"%s\" (provider \"%s\"), "
+ " which is not recognized by setlocale().",
+ collate, get_collprovider_name(COLLPROVIDER_LIBC)),
errhint("Recreate the database with another locale or install the missing locale.")));
- if (pg_perm_setlocale(LC_CTYPE, ctype) == NULL)
+ /* check lc_collate and lc_ctype for icu if we need it */
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+ if (pg_perm_setlocale(LC_COLLATE, collate, collprovider) == NULL)
+ ereport(FATAL,
+ (errmsg("database locale is incompatible with operating system"),
+ errdetail("The database was initialized with LC_COLLATE \"%s\" (provider \"%s\"), "
+ " which is not recognized by uloc_setDefault().",
+ collate, get_collprovider_name(collprovider)),
+ errhint("Recreate the database with another locale or install the missing locale.")));
+
+ /* This could happen when manually creating a mess in the catalogs. */
+ if (strcmp(collate, ctype) != 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("collations with different collate and ctype values are not supported by ICU")));
+ }
+
+ if (pg_perm_setlocale(LC_CTYPE, ctype, '\0') == NULL)
ereport(FATAL,
(errmsg("database locale is incompatible with operating system"),
errdetail("The database was initialized with LC_CTYPE \"%s\", "
" which is not recognized by setlocale().", ctype),
errhint("Recreate the database with another locale or install the missing locale.")));
+ /* get the actual version of the collation */
+
+#ifdef USE_ICU
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+ collcollate = (const char *) collate;
+#ifdef WIN32
+ if (!locale_is_c(collcollate))
+ {
+ wincollate = check_icu_winlocale(collcollate);
+ collcollate = (const char *) wincollate;
+ }
+#endif /* WIN32 */
+ langtag = get_icu_language_tag(collcollate);
+ collcollate = get_icu_collate(collcollate, langtag);
+ }
+ else
+#endif /* USE_ICU */
+ {
+ /* COLLPROVIDER_LIBC */
+ collcollate = (const char *) collate;
+ }
+
+ actual_versionstr = get_collation_actual_version(collprovider, collcollate);
+
+ /*
+ * Check the collation version (this matches the version checking in the
+ * function pg_newlocale_from_collation())
+ */
+ if (collversion)
+ {
+ if (!actual_versionstr)
+ {
+ /*
+ * This could happen when manually creating a mess in the catalogs.
+ */
+ ereport(ERROR,
+ (errmsg("collation \"%s\" (provider \"%s\") has no actual version, but a version was specified",
+ collate, get_collprovider_name(collprovider))));
+ }
+
+ if (strcmp(actual_versionstr, collversion) != 0)
+ ereport(ERROR,
+ (errmsg("collation \"%s\" (provider \"%s\") has version mismatch",
+ collate, get_collprovider_name(collprovider)),
+ errdetail("The collation in the database was created using version %s, "
+ "but the operating system provides version %s.",
+ collversion, actual_versionstr),
+ errhint("Build PostgreSQL with the right library version.")));
+ }
+
/* Make the locale settings visible as GUC variables, too */
- SetConfigOption("lc_collate", collate, PGC_INTERNAL, PGC_S_OVERRIDE);
+ SetConfigOption("lc_collate", datcollate, PGC_INTERNAL, PGC_S_OVERRIDE);
SetConfigOption("lc_ctype", ctype, PGC_INTERNAL, PGC_S_OVERRIDE);
+ pfree(collate);
+ if (collversion)
+ pfree(collversion);
+ if (langtag)
+ pfree(langtag);
+ if (actual_versionstr)
+ pfree(actual_versionstr);
+ if (wincollate)
+ pfree(wincollate);
+
check_strxfrm_bug();
ReleaseSysCache(tup);
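The version check added to CheckMyDatabase() compares the version recorded with the database (collversion) against what the library reports now (actual_versionstr). Stripped of the ereport() calls, the decision looks roughly like the sketch below. The enum and function names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes; the patch raises errors instead. */
enum sketch_version_check
{
	SKETCH_VERSION_OK,
	SKETCH_VERSION_MISSING,		/* recorded, but provider reports none */
	SKETCH_VERSION_MISMATCH		/* recorded and actual differ */
};

enum sketch_version_check
sketch_check_collversion(const char *recorded, const char *actual)
{
	/* Nothing was recorded at creation time, so nothing to compare. */
	if (recorded == NULL)
		return SKETCH_VERSION_OK;

	/* A version was recorded but the provider has no version now. */
	if (actual == NULL)
		return SKETCH_VERSION_MISSING;

	assert(recorded != NULL && actual != NULL);
	return strcmp(recorded, actual) == 0
		? SKETCH_VERSION_OK
		: SKETCH_VERSION_MISMATCH;
}
```

In the hunk, the last two outcomes are both reported with ereport(ERROR), while the surrounding locale failures in the same function use FATAL.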
diff --git a/src/backend/utils/mb/encnames.c b/src/backend/utils/mb/encnames.c
index 12b61cd3db..1e75257651 100644
--- a/src/backend/utils/mb/encnames.c
+++ b/src/backend/utils/mb/encnames.c
@@ -403,8 +403,6 @@ const pg_enc2gettext pg_enc2gettext_tbl[] =
};
-#ifndef FRONTEND
-
/*
* Table of encoding names for ICU
*
@@ -457,6 +455,7 @@ is_encoding_supported_by_icu(int encoding)
return (pg_enc2icu_tbl[encoding] != NULL);
}
+#ifndef FRONTEND
const char *
get_encoding_name_for_icu(int encoding)
{
@@ -475,7 +474,6 @@ get_encoding_name_for_icu(int encoding)
return icu_encoding_name;
}
-
#endif /* not FRONTEND */
diff --git a/src/bin/initdb/Makefile b/src/bin/initdb/Makefile
index 8c23941930..3733399888 100644
--- a/src/bin/initdb/Makefile
+++ b/src/bin/initdb/Makefile
@@ -19,7 +19,7 @@ include $(top_builddir)/src/Makefile.global
override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) -I$(top_srcdir)/src/timezone $(CPPFLAGS)
# note: we need libpq only because fe_utils does
-LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport) $(ICU_LIBS)
# use system timezone data?
ifneq (,$(with_system_tzdata))
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index ab5cb7f0c1..5dc569c3a1 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -55,6 +55,10 @@
#include <signal.h>
#include <time.h>
+#ifdef USE_ICU
+#include <unicode/uloc.h>
+#endif
+
#ifdef HAVE_SHM_OPEN
#include "sys/mman.h"
#endif
@@ -65,6 +69,7 @@
#include "catalog/pg_collation_d.h"
#include "common/file_perm.h"
#include "common/file_utils.h"
+#include "common/pg_collation_fn_common.h"
#include "common/restricted_token.h"
#include "common/username.h"
#include "fe_utils/string_utils.h"
@@ -144,6 +149,8 @@ static bool data_checksums = false;
static char *xlog_dir = NULL;
static char *str_wal_segment_size_mb = NULL;
static int wal_segment_size_mb;
+static char collprovider = '\0';
+static char *collversion = NULL;
/* internal vars */
@@ -269,10 +276,15 @@ static char *escape_quotes(const char *src);
static char *escape_quotes_bki(const char *src);
static int locale_date_order(const char *locale);
static void check_locale_name(int category, const char *locale,
- char **canonname);
-static bool check_locale_encoding(const char *locale, int encoding);
+ char **canonname, char collprovider);
+static bool check_locale_encoding(const char *locale, int encoding,
+ char collprovider);
static void setlocales(void);
static void usage(const char *progname);
+#ifdef USE_ICU
+static char *check_icu_locale_name(const char *locale);
+#endif
+static void set_collation_version(void);
void setup_pgdata(void);
void setup_bin_paths(const char *argv0);
void setup_data_file_paths(void);
@@ -1407,10 +1419,27 @@ bootstrap_template1(void)
char **bki_lines;
char headerline[MAXPGPATH];
char buf[64];
+ char *lc_collate_full_name;
printf(_("running bootstrap script ... "));
fflush(stdout);
+ Assert(lc_collate);
+
+ lc_collate_full_name = get_full_collation_name(lc_collate, collprovider,
+ collversion);
+
+ if (!lc_collate_full_name)
+ exit(1); /* get_full_collation_name printed the error */
+
+ if (strlen(lc_collate_full_name) >= NAMEDATALEN)
+ {
+ fprintf(stderr,
+ _("%s: the full collation name \"%s\" is too long\n"),
+ progname, lc_collate_full_name);
+ exit(1);
+ }
+
bki_lines = readfile(bki_file);
/* Check that bki file appears to be of the right version */
@@ -1452,7 +1481,7 @@ bootstrap_template1(void)
encodingid_to_string(encodingid));
bki_lines = replace_token(bki_lines, "LC_COLLATE",
- escape_quotes_bki(lc_collate));
+ escape_quotes_bki(lc_collate_full_name));
bki_lines = replace_token(bki_lines, "LC_CTYPE",
escape_quotes_bki(lc_ctype));
@@ -1494,6 +1523,7 @@ bootstrap_template1(void)
PG_CMD_CLOSE;
free(bki_lines);
+ free(lc_collate_full_name);
check_ok();
}
@@ -2244,53 +2274,143 @@ locale_date_order(const char *locale)
* the locale name, but typically it doesn't.)
*
* this should match the backend's check_locale() function
+ *
+ * Set collprovider to '\0' if category is not LC_COLLATE.
*/
static void
-check_locale_name(int category, const char *locale, char **canonname)
+check_locale_name(int category, const char *locale, char **canonname,
+ char collprovider)
{
- char *save;
- char *res;
+ const char *save;
+ const char *res;
+ char *save_dup;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY =
+ category != LC_COLLATE || collprovider == COLLPROVIDER_LIBC;
+ bool use_icu =
+ category == LC_COLLATE && collprovider == COLLPROVIDER_ICU;
+ bool failure = false;
+#ifdef USE_ICU
+ UErrorCode status;
+ char *icu_locale;
+#endif
- if (canonname)
- *canonname = NULL; /* in case of failure */
+ Assert(use_libc || use_icu);
- save = setlocale(category, NULL);
- if (!save)
+#ifndef USE_ICU
+ if (use_icu)
{
- fprintf(stderr, _("%s: setlocale() failed\n"),
+ fprintf(stderr,
+ _("%s: ICU is not supported in this build\n"
+ "You need to rebuild PostgreSQL using --with-icu.\n"),
progname);
exit(1);
}
+#endif
+
+ if (canonname)
+ *canonname = NULL; /* in case of failure */
+
+#ifdef USE_ICU
+ if (use_icu)
+ {
+ save = uloc_getDefault();
+ if (!save)
+ {
+ fprintf(stderr, _("%s: ICU error: uloc_getDefault() failed\n"),
+ progname);
+ exit(1);
+ }
+ }
+ else
+#endif
+ {
+ /* use_libc */
+ save = setlocale(category, NULL);
+ if (!save)
+ {
+ fprintf(stderr, _("%s: setlocale() failed\n"),
+ progname);
+ exit(1);
+ }
+ }
/* save may be pointing at a modifiable scratch variable, so copy it. */
- save = pg_strdup(save);
+ save_dup = pg_strdup(save);
/* for setlocale() call */
if (!locale)
locale = "";
/* set the locale with setlocale, to see if it accepts it. */
- res = setlocale(category, locale);
+#ifdef USE_ICU
+ if (use_icu)
+ {
+ icu_locale = check_icu_locale_name(locale);
+ if (icu_locale == NULL && locale != NULL)
+ {
+ failure = true;
+ res = NULL;
+ }
+ else
+ {
+ status = U_ZERO_ERROR;
+ uloc_setDefault(icu_locale, &status);
+ res = uloc_getDefault();
+ failure = (U_FAILURE(status) || res == NULL);
+ if (icu_locale)
+ pfree(icu_locale);
+ }
+ }
+ else
+#endif
+ {
+ /* use_libc */
+ res = setlocale(category, locale);
+ failure = (res == NULL);
+ }
/* save canonical name if requested. */
if (res && canonname)
*canonname = pg_strdup(res);
/* restore old value. */
- if (!setlocale(category, save))
+#ifdef USE_ICU
+ if (use_icu)
{
- fprintf(stderr, _("%s: failed to restore old locale \"%s\"\n"),
- progname, save);
- exit(1);
+ status = U_ZERO_ERROR;
+ uloc_setDefault(save_dup, &status);
+ if (U_FAILURE(status))
+ {
+ fprintf(stderr, _("%s: ICU error: failed to restore old locale \"%s\"\n"),
+ progname, save_dup);
+ exit(1);
+ }
}
- free(save);
+ else
+#endif
+ {
+ /* use_libc */
+ if (!setlocale(category, save_dup))
+ {
+ fprintf(stderr, _("%s: failed to restore old locale \"%s\"\n"),
+ progname, save_dup);
+ exit(1);
+ }
+ }
+ free(save_dup);
/* complain if locale wasn't valid */
- if (res == NULL)
+ if (failure)
{
if (*locale)
- fprintf(stderr, _("%s: invalid locale name \"%s\"\n"),
- progname, locale);
+ {
+ if (category == LC_COLLATE)
+ fprintf(stderr, _("%s: invalid locale name \"%s\" (provider \"%s\")\n"),
+ progname, locale, get_collprovider_name(collprovider));
+ else
+ fprintf(stderr, _("%s: invalid locale name \"%s\"\n"),
+ progname, locale);
+ }
else
{
/*
@@ -2312,9 +2432,11 @@ check_locale_name(int category, const char *locale, char **canonname)
* check if the chosen encoding matches the encoding required by the locale
*
* this should match the similar check in the backend createdb() function
+ *
+ * Set collprovider to '\0' if category is not LC_COLLATE.
*/
static bool
-check_locale_encoding(const char *locale, int user_enc)
+check_locale_encoding(const char *locale, int user_enc, char collprovider)
{
int locale_enc;
@@ -2341,6 +2463,25 @@ check_locale_encoding(const char *locale, int user_enc)
progname);
return false;
}
+
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+#ifdef USE_ICU
+ if (!is_encoding_supported_by_icu(user_enc))
+ {
+ fprintf(stderr, _("%s: selected encoding (%s) is not supported for ICU locales\n"),
+ progname, pg_encoding_to_char(user_enc));
+ return false;
+ }
+#else /* not USE_ICU */
+ fprintf(stderr,
+ _("%s: ICU is not supported in this build\n"
+ "You need to rebuild PostgreSQL using --with-icu.\n"),
+ progname);
+ exit(1);
+#endif /* not USE_ICU */
+ }
+
return true;
}
@@ -2352,16 +2493,22 @@ check_locale_encoding(const char *locale, int user_enc)
static void
setlocales(void)
{
- char *canonname;
-
- /* set empty lc_* values to locale config if set */
+ char *canonname = NULL;
if (locale)
{
+ /*
+ * Set up the collation provider if possible and canonicalize the locale
+ * name.
+ */
+ check_locale_collprovider(locale, &canonname, &collprovider, NULL);
+ if (!canonname)
+ exit(1); /* check_locale_collprovider printed the error */
+ locale = canonname;
+
+ /* set empty lc_* values to locale config if set */
if (!lc_ctype)
lc_ctype = locale;
- if (!lc_collate)
- lc_collate = locale;
if (!lc_numeric)
lc_numeric = locale;
if (!lc_time)
@@ -2372,29 +2519,83 @@ setlocales(void)
lc_messages = locale;
}
+ if (lc_collate)
+ {
+ /*
+ * Set up the collation provider if possible and canonicalize the locale
+ * name.
+ */
+ check_locale_collprovider(lc_collate, &canonname, &collprovider, NULL);
+ if (!canonname)
+ exit(1); /* check_locale_collprovider printed the error */
+ lc_collate = canonname;
+ }
+ else if (canonname)
+ {
+ /* we have already canonicalized the locale name */
+ lc_collate = pstrdup(canonname);
+ }
+
/*
* canonicalize locale names, and obtain any missing values from our
* current environment
*/
- check_locale_name(LC_CTYPE, lc_ctype, &canonname);
+ check_locale_name(LC_CTYPE, lc_ctype, &canonname, '\0');
lc_ctype = canonname;
- check_locale_name(LC_COLLATE, lc_collate, &canonname);
+
+ /* we always check lc_collate for libc */
+ check_locale_name(LC_COLLATE, lc_collate, &canonname, COLLPROVIDER_LIBC);
+ if (lc_collate)
+ pfree(lc_collate);
lc_collate = canonname;
- check_locale_name(LC_NUMERIC, lc_numeric, &canonname);
+
+ /* determine the collation provider if we haven't already done it */
+ if (!is_valid_nondefault_collprovider(collprovider))
+ {
+#ifdef USE_ICU
+ if (!locale_is_c(lc_collate))
+ {
+ collprovider = COLLPROVIDER_ICU;
+ }
+ else
+#endif
+ {
+ collprovider = COLLPROVIDER_LIBC;
+ }
+ }
+
+ Assert(is_valid_nondefault_collprovider(collprovider));
+
+ /* check lc_collate and lc_ctype for icu if we need it */
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+ check_locale_name(LC_COLLATE, lc_collate, NULL, collprovider);
+ if (strcmp(lc_collate, lc_ctype) != 0)
+ {
+ fprintf(stderr,
+ _("%s: collations with different collate and ctype values are not supported by ICU\n"),
+ progname);
+ exit(1);
+ }
+ }
+
+ check_locale_name(LC_NUMERIC, lc_numeric, &canonname, '\0');
lc_numeric = canonname;
- check_locale_name(LC_TIME, lc_time, &canonname);
+ check_locale_name(LC_TIME, lc_time, &canonname, '\0');
lc_time = canonname;
- check_locale_name(LC_MONETARY, lc_monetary, &canonname);
+ check_locale_name(LC_MONETARY, lc_monetary, &canonname, '\0');
lc_monetary = canonname;
#if defined(LC_MESSAGES) && !defined(WIN32)
- check_locale_name(LC_MESSAGES, lc_messages, &canonname);
+ check_locale_name(LC_MESSAGES, lc_messages, &canonname, '\0');
lc_messages = canonname;
#else
/* when LC_MESSAGES is not available, use the LC_CTYPE setting */
- check_locale_name(LC_CTYPE, lc_messages, &canonname);
+ check_locale_name(LC_CTYPE, lc_messages, &canonname, '\0');
lc_messages = canonname;
#endif
+
+ set_collation_version();
}
/*
@@ -2612,6 +2813,9 @@ setup_locale_encoding(void)
lc_time);
}
+ printf(_("The default collation provider is \"%s\".\n"),
+ get_collprovider_name(collprovider));
+
if (!encoding)
{
int ctype_enc;
@@ -2662,8 +2866,8 @@ setup_locale_encoding(void)
else
encodingid = get_encoding_id(encoding);
- if (!check_locale_encoding(lc_ctype, encodingid) ||
- !check_locale_encoding(lc_collate, encodingid))
+ if (!check_locale_encoding(lc_ctype, encodingid, '\0') ||
+ !check_locale_encoding(lc_collate, encodingid, collprovider))
exit(1); /* check_locale_encoding printed the error */
}
@@ -3439,3 +3643,113 @@ main(int argc, char *argv[])
return 0;
}
+
+#ifdef USE_ICU
+/*
+ * If locale is "" return the environment value from setlocale().
+ *
+ * Otherwise return a malloc'd copy of locale if it is not NULL.
+ *
+ * This should match the backend's check_icu_locale() function.
+ */
+static char *
+check_icu_locale_name(const char *locale)
+{
+ char *canonname = NULL;
+ char *winlocale = NULL;
+ char *result;
+
+ /* Windows locales can be in the format ".codepage" */
+ if (locale && (strlen(locale) == 0 || locale[0] == '.'))
+ {
+ check_locale_name(LC_COLLATE, locale, &canonname, COLLPROVIDER_LIBC);
+ locale = (const char *) canonname;
+ }
+
+#ifdef WIN32
+ if (!locale_is_c(locale))
+ {
+ winlocale = check_icu_winlocale(locale);
+
+ if (winlocale == NULL && locale != NULL)
+ exit(1); /* check_icu_winlocale printed the error */
+ else
+ locale = winlocale;
+ }
+#endif
+
+ result = locale ? pstrdup(locale) : NULL;
+
+ if (canonname)
+ pfree(canonname);
+ if (winlocale)
+ pfree(winlocale);
+
+ return result;
+}
+#endif /* USE_ICU */
+
+/*
+ * Set up the lc_collate version (get it from the collation provider).
+ */
+static void
+set_collation_version(void)
+{
+ char *wincollate = NULL;
+ char *langtag = NULL;
+ const char *collate;
+ bool failure;
+
+ Assert(lc_collate);
+ Assert(is_valid_nondefault_collprovider(collprovider));
+
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+#ifdef USE_ICU
+ collate = (const char *) lc_collate;
+
+#ifdef WIN32
+ if (!locale_is_c(collate))
+ {
+ wincollate = check_icu_winlocale(collate);
+
+ if (wincollate == NULL && collate != NULL)
+ exit(1); /* check_icu_winlocale printed the error */
+ else
+ collate = (const char *) wincollate;
+ }
+#endif /* WIN32 */
+
+ langtag = get_icu_language_tag(collate);
+ if (!langtag)
+ {
+ /* get_icu_language_tag printed the main error message */
+ fprintf(stderr, _("Rerun %s with a different locale selection.\n"),
+ progname);
+ exit(1);
+ }
+ collate = get_icu_collate(collate, langtag);
+#else /* not USE_ICU */
+ fprintf(stderr,
+ _("%s: ICU is not supported in this build\n"
+ "You need to rebuild PostgreSQL using --with-icu.\n"),
+ progname);
+ exit(1);
+#endif /* not USE_ICU */
+ }
+ else
+ {
+ /* COLLPROVIDER_LIBC */
+ collate = (const char *) lc_collate;
+ }
+
+ get_collation_actual_version(collprovider, collate, &collversion, &failure);
+ if (failure)
+ /* get_collation_actual_version printed the error */
+ exit(1);
+
+ if (langtag)
+ free(langtag);
+ if (wincollate)
+ free(wincollate);
+}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index c8d01ed4a4..1e56d90260 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -47,12 +47,14 @@
#include "catalog/pg_attribute_d.h"
#include "catalog/pg_cast_d.h"
#include "catalog/pg_class_d.h"
+#include "catalog/pg_collation.h"
#include "catalog/pg_default_acl_d.h"
#include "catalog/pg_largeobject_d.h"
#include "catalog/pg_largeobject_metadata_d.h"
#include "catalog/pg_proc_d.h"
#include "catalog/pg_trigger_d.h"
#include "catalog/pg_type_d.h"
+#include "common/pg_collation_fn_common.h"
#include "libpq/libpq-fs.h"
#include "storage/block.h"
@@ -13299,9 +13301,10 @@ dumpCollation(Archive *fout, CollInfo *collinfo)
int i_collprovider;
int i_collcollate;
int i_collctype;
- const char *collprovider;
+ const char *collproviderstr;
const char *collcollate;
const char *collctype;
+ const char *collprovider_name;
/* Skip if not to be dumped */
if (!collinfo->dobj.dump || dopt->dataOnly)
@@ -13339,28 +13342,28 @@ dumpCollation(Archive *fout, CollInfo *collinfo)
i_collcollate = PQfnumber(res, "collcollate");
i_collctype = PQfnumber(res, "collctype");
- collprovider = PQgetvalue(res, 0, i_collprovider);
+ collproviderstr = PQgetvalue(res, 0, i_collprovider);
collcollate = PQgetvalue(res, 0, i_collcollate);
collctype = PQgetvalue(res, 0, i_collctype);
+ /*
+ * Use COLLPROVIDER_DEFAULT to allow dumping pg_catalog; not accepted on
+ * input
+ */
+ collprovider_name = get_collprovider_name(collproviderstr[0]);
+ if (!collprovider_name)
+ exit_horribly(NULL,
+ "unrecognized collation provider: %s\n",
+ collproviderstr);
+
+
appendPQExpBuffer(delq, "DROP COLLATION %s;\n",
fmtQualifiedDumpable(collinfo));
appendPQExpBuffer(q, "CREATE COLLATION %s (",
fmtQualifiedDumpable(collinfo));
- appendPQExpBufferStr(q, "provider = ");
- if (collprovider[0] == 'c')
- appendPQExpBufferStr(q, "libc");
- else if (collprovider[0] == 'i')
- appendPQExpBufferStr(q, "icu");
- else if (collprovider[0] == 'd')
- /* to allow dumping pg_catalog; not accepted on input */
- appendPQExpBufferStr(q, "default");
- else
- exit_horribly(NULL,
- "unrecognized collation provider: %s\n",
- collprovider);
+ appendPQExpBuffer(q, "provider = %s", collprovider_name);
if (strcmp(collcollate, collctype) == 0)
{
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 4ca0db1d0c..483bcf208d 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -17,7 +17,9 @@
#include "catalog/pg_attribute_d.h"
#include "catalog/pg_cast_d.h"
#include "catalog/pg_class_d.h"
+#include "catalog/pg_collation.h"
#include "catalog/pg_default_acl_d.h"
+#include "common/pg_collation_fn_common.h"
#include "fe_utils/string_utils.h"
#include "common.h"
@@ -4094,7 +4096,13 @@ listCollations(const char *pattern, bool verbose, bool showSystem)
if (pset.sversion >= 100000)
appendPQExpBuffer(&buf,
- ",\n CASE c.collprovider WHEN 'd' THEN 'default' WHEN 'c' THEN 'libc' WHEN 'i' THEN 'icu' END AS \"%s\"",
+ ",\n CASE c.collprovider WHEN '%c' THEN '%s' WHEN '%c' THEN '%s' WHEN '%c' THEN '%s' END AS \"%s\"",
+ COLLPROVIDER_DEFAULT,
+ get_collprovider_name(COLLPROVIDER_DEFAULT),
+ COLLPROVIDER_LIBC,
+ get_collprovider_name(COLLPROVIDER_LIBC),
+ COLLPROVIDER_ICU,
+ get_collprovider_name(COLLPROVIDER_ICU),
gettext_noop("Provider"));
if (verbose)
diff --git a/src/bin/scripts/Makefile b/src/bin/scripts/Makefile
index 4c6e4b9395..1e80f0b842 100644
--- a/src/bin/scripts/Makefile
+++ b/src/bin/scripts/Makefile
@@ -19,7 +19,7 @@ include $(top_builddir)/src/Makefile.global
PROGRAMS = createdb createuser dropdb dropuser clusterdb vacuumdb reindexdb pg_isready
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
-LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport) $(ICU_LIBS)
all: $(PROGRAMS)
diff --git a/src/bin/scripts/createdb.c b/src/bin/scripts/createdb.c
index fc108882e4..816e3222f7 100644
--- a/src/bin/scripts/createdb.c
+++ b/src/bin/scripts/createdb.c
@@ -58,6 +58,7 @@ main(int argc, char *argv[])
char *lc_collate = NULL;
char *lc_ctype = NULL;
char *locale = NULL;
+ char *canonname = NULL;
PQExpBufferData sql;
@@ -153,7 +154,15 @@ main(int argc, char *argv[])
progname);
exit(1);
}
- lc_ctype = locale;
+
+ /*
+ * remove the collation provider modifier from the locale for lc_ctype
+ */
+ check_locale_collprovider(locale, &canonname, NULL, NULL);
+ if (!canonname)
+ exit(1); /* check_locale_collprovider printed the error */
+ lc_ctype = canonname;
+
lc_collate = locale;
}
@@ -241,6 +250,9 @@ main(int argc, char *argv[])
PQfinish(conn);
+ if (canonname)
+ pfree(canonname);
+
exit(0);
}
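
For readers following the createdb change above: it relies on check_locale_collprovider() to strip a trailing "@provider[.version]" modifier from the locale name. Below is a minimal standalone sketch of that splitting rule as I read it from the patch. The names sketch_locale, sketch_copy and sketch_split_locale are hypothetical and not part of the patch. The sketch uses plain malloc and POSIX strncasecmp rather than the patch's STRDUP/ALLOC macros and pg_strcasecmp, and it hard-codes the two provider names instead of consulting the lookup table.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>			/* strncasecmp (POSIX) */

/* Hypothetical result type; not part of the patch. */
typedef struct
{
	char	   *name;			/* malloc'd locale name without a recognized
								 * provider modifier */
	char		provider;		/* 'c' (libc), 'i' (icu), or '\0' if none */
	char	   *version;		/* malloc'd version string, or NULL */
} sketch_locale;

/* Copy the first n bytes of s into a fresh NUL-terminated buffer. */
static char *
sketch_copy(const char *s, size_t n)
{
	char	   *out = malloc(n + 1);

	assert(out != NULL);
	memcpy(out, s, n);
	out[n] = '\0';
	return out;
}

/* Split "name@provider[.version]"; locale must not be NULL. */
sketch_locale
sketch_split_locale(const char *locale)
{
	sketch_locale r = {NULL, '\0', NULL};
	const char *at = strrchr(locale, '@');	/* last '@', as in the patch */

	if (at != NULL)
	{
		const char *dot = strchr(at, '.');
		size_t		plen = dot ? (size_t) (dot - at - 1) : strlen(at + 1);

		if (plen == 4 && strncasecmp(at + 1, "libc", 4) == 0)
			r.provider = 'c';
		else if (plen == 3 && strncasecmp(at + 1, "icu", 3) == 0)
			r.provider = 'i';

		if (r.provider != '\0')
		{
			r.name = sketch_copy(locale, (size_t) (at - locale));
			if (dot != NULL)
				r.version = sketch_copy(dot + 1, strlen(dot + 1));
			return r;
		}
	}

	/* No recognized provider modifier: keep the whole name. */
	r.name = sketch_copy(locale, strlen(locale));
	return r;
}
```

Note that an unrecognized modifier such as "@euro" is left in the name untouched, which matches the "just copy" fallback in the patch.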
diff --git a/src/common/Makefile b/src/common/Makefile
index ec8139f014..6017a250c6 100644
--- a/src/common/Makefile
+++ b/src/common/Makefile
@@ -48,7 +48,7 @@ OBJS_COMMON = base64.o config_info.o controldata_utils.o exec.o file_perm.o \
ip.o keywords.o link-canary.o md5.o pg_lzcompress.o \
pgfnames.o psprintf.o relpath.o \
rmtree.o saslprep.o scram-common.o string.o unicode_norm.o \
- username.o wait_error.o
+ username.o wait_error.o pg_collation_fn_common.o
ifeq ($(with_openssl),yes)
OBJS_COMMON += sha2_openssl.o
diff --git a/src/common/pg_collation_fn_common.c b/src/common/pg_collation_fn_common.c
new file mode 100644
index 0000000000..a3ba3a368d
--- /dev/null
+++ b/src/common/pg_collation_fn_common.c
@@ -0,0 +1,90 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_collation_fn_common.c
+ * common routines to support manipulation of the pg_collation relation
+ *
+ * Portions Copyright (c) 1996-2017, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/common/pg_collation_fn_common.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifdef FRONTEND
+#include "postgres_fe.h"
+#else
+#include "postgres.h"
+#endif
+
+#include "catalog/pg_collation.h"
+#include "common/pg_collation_fn_common.h"
+
+
+/*
+ * Note that we search the table with pg_strcasecmp(), so variant
+ * capitalizations don't need their own entries.
+ */
+typedef struct collprovider_name
+{
+ char collprovider;
+ const char *name;
+} collprovider_name;
+
+static const collprovider_name collprovider_name_tbl[] =
+{
+ {COLLPROVIDER_DEFAULT, "default"},
+ {COLLPROVIDER_LIBC, "libc"},
+ {COLLPROVIDER_ICU, "icu"},
+ {'\0', NULL} /* end marker */
+};
+
+/*
+ * Get the collation provider from the given collation provider name.
+ *
+ * Return '\0' if we can't determine it.
+ */
+char
+get_collprovider(const char *name)
+{
+ int i;
+
+ if (!name)
+ return '\0';
+
+ /* Check the table */
+ for (i = 0; collprovider_name_tbl[i].name; ++i)
+ if (pg_strcasecmp(name, collprovider_name_tbl[i].name) == 0)
+ return collprovider_name_tbl[i].collprovider;
+
+ return '\0';
+}
+
+/*
+ * Get the name of the given collation provider.
+ *
+ * Return NULL if we can't determine it.
+ */
+const char *
+get_collprovider_name(char collprovider)
+{
+ int i;
+
+ /* Check the table */
+ for (i = 0; collprovider_name_tbl[i].collprovider; ++i)
+ if (collprovider_name_tbl[i].collprovider == collprovider)
+ return collprovider_name_tbl[i].name;
+
+ return NULL;
+}
+
+/*
+ * Return true if collation provider is nondefault and valid, and false otherwise.
+ */
+bool
+is_valid_nondefault_collprovider(char collprovider)
+{
+ return (collprovider == COLLPROVIDER_LIBC ||
+ collprovider == COLLPROVIDER_ICU);
+}
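
To illustrate the lookup table in the new file above outside the PostgreSQL tree, here is a minimal standalone sketch of the same two-way mapping between provider codes and names. It assumes the provider codes 'd', 'c' and 'i' (the values the old pg_dump code compared against) and swaps pg_strcasecmp for POSIX strcasecmp. All sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>			/* strcasecmp (POSIX) */

/* Assumed to mirror COLLPROVIDER_DEFAULT/LIBC/ICU in catalog/pg_collation.h. */
#define SKETCH_PROVIDER_DEFAULT 'd'
#define SKETCH_PROVIDER_LIBC	'c'
#define SKETCH_PROVIDER_ICU		'i'

typedef struct
{
	char		code;
	const char *name;
} sketch_provider;

static const sketch_provider sketch_tbl[] = {
	{SKETCH_PROVIDER_DEFAULT, "default"},
	{SKETCH_PROVIDER_LIBC, "libc"},
	{SKETCH_PROVIDER_ICU, "icu"},
	{'\0', NULL}				/* end marker */
};

/* Name to code; returns '\0' for NULL or an unknown name. */
char
sketch_provider_from_name(const char *name)
{
	if (name == NULL)
		return '\0';
	for (int i = 0; sketch_tbl[i].name != NULL; i++)
		if (strcasecmp(name, sketch_tbl[i].name) == 0)
			return sketch_tbl[i].code;
	return '\0';
}

/* Code to name; returns NULL for an unknown code. */
const char *
sketch_provider_name(char code)
{
	for (int i = 0; sketch_tbl[i].code != '\0'; i++)
		if (sketch_tbl[i].code == code)
			return sketch_tbl[i].name;
	return NULL;
}
```

Keeping both directions in one table is what lets pg_dump and psql drop their hand-written 'c'/'i'/'d' branches in the hunks above.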
diff --git a/src/fe_utils/.gitignore b/src/fe_utils/.gitignore
index 37f5f7514d..b14041b5cf 100644
--- a/src/fe_utils/.gitignore
+++ b/src/fe_utils/.gitignore
@@ -1 +1,2 @@
/psqlscan.c
+/pg_collation_fn_common.c
diff --git a/src/fe_utils/Makefile b/src/fe_utils/Makefile
index 5362cffd57..f6ffa0905b 100644
--- a/src/fe_utils/Makefile
+++ b/src/fe_utils/Makefile
@@ -19,7 +19,8 @@ include $(top_builddir)/src/Makefile.global
override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) $(CPPFLAGS)
-OBJS = mbprint.o print.o psqlscan.o simple_list.o string_utils.o conditional.o
+OBJS = mbprint.o print.o psqlscan.o simple_list.o string_utils.o conditional.o \
+ pg_collation_fn_common.o
all: libpgfeutils.a
@@ -33,6 +34,13 @@ psqlscan.c: FLEX_FIX_WARNING=yes
distprep: psqlscan.c
+# Pull in pg_collation_fn_common.c from src/common. That exposes us to
+# risks of version skew if we link to a shared library. Do it the
+# hard way, instead, so that we're statically linked.
+
+pg_collation_fn_common.c: % : $(top_srcdir)/src/common/%
+ rm -f $@ && $(LN_S) $< .
+
# libpgfeutils could be useful to contrib, so install it
install: all installdirs
$(INSTALL_STLIB) libpgfeutils.a '$(DESTDIR)$(libdir)/libpgfeutils.a'
@@ -45,6 +53,7 @@ uninstall:
clean distclean:
rm -f libpgfeutils.a $(OBJS) lex.backup
+ rm -f pg_collation_fn_common.c
# psqlscan.c is supposed to be in the distribution tarball,
# so do not clean it in the clean/distclean rules
diff --git a/src/include/commands/dbcommands.h b/src/include/commands/dbcommands.h
index 677c7fc5fc..d1b27761d4 100644
--- a/src/include/commands/dbcommands.h
+++ b/src/include/commands/dbcommands.h
@@ -29,6 +29,7 @@ extern ObjectAddress AlterDatabaseOwner(const char *dbname, Oid newOwnerId);
extern Oid get_database_oid(const char *dbname, bool missingok);
extern char *get_database_name(Oid dbid);
-extern void check_encoding_locale_matches(int encoding, const char *collate, const char *ctype);
+extern void check_encoding_locale_matches(int encoding, const char *collate, const char *ctype,
+ char collprovider);
#endif /* DBCOMMANDS_H */
diff --git a/src/include/common/pg_collation_fn_common.h b/src/include/common/pg_collation_fn_common.h
new file mode 100644
index 0000000000..f05778dfad
--- /dev/null
+++ b/src/include/common/pg_collation_fn_common.h
@@ -0,0 +1,22 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_collation_fn_common.h
+ * prototypes for functions in common/pg_collation_fn_common.c
+ *
+ *
+ * Portions Copyright (c) 1996-2017, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/common/pg_collation_fn_common.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#ifndef PG_COLLATION_FN_COMMON_H
+#define PG_COLLATION_FN_COMMON_H
+
+extern char get_collprovider(const char *name);
+extern const char *get_collprovider_name(char collprovider);
+extern bool is_valid_nondefault_collprovider(char collprovider);
+
+#endif /* PG_COLLATION_FN_COMMON_H */
diff --git a/src/include/pg_config.h.win32 b/src/include/pg_config.h.win32
index f7a051d112..a0dd14ebb6 100644
--- a/src/include/pg_config.h.win32
+++ b/src/include/pg_config.h.win32
@@ -686,6 +686,10 @@
/* Define to use /dev/urandom for random number generation */
/* #undef USE_DEV_URANDOM */
+/* Define to build with ICU support. (--with-icu) */
+/* #undef USE_ICU */
+
+
/* Define to 1 to build with LDAP support. (--with-ldap) */
/* #undef USE_LDAP */
diff --git a/src/include/port.h b/src/include/port.h
index 3a53bcf2e4..1ddecce5c4 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -473,6 +473,40 @@ extern int pg_get_encoding_from_locale(const char *ctype, bool write_message);
extern int pg_codepage_to_encoding(UINT cp);
#endif
+/* do not build libpq with ICU support */
+#ifndef LIBPQ_MAKE
+
+extern void check_locale_collprovider(const char *locale, char **canonname,
+ char *collprovider, char **collversion);
+extern bool locale_is_c(const char *locale);
+extern char *get_full_collation_name(const char *locale, char collprovider,
+ const char *collversion);
+
+#ifdef FRONTEND
+extern void get_collation_actual_version(char collprovider,
+ const char *collcollate,
+ char **collversion, bool *failure);
+#else
+extern char *get_collation_actual_version(char collprovider,
+ const char *collcollate);
+#endif
+
+#ifdef USE_ICU
+#define ICU_ROOT_LOCALE "root"
+
+/* Users of this must include unicode/ucol.h too. */
+struct UCollator;
+extern struct UCollator *open_collator(const char *collate);
+
+extern char *get_icu_language_tag(const char *localename);
+extern const char *get_icu_collate(const char *locale, const char *langtag);
+#ifdef WIN32
+extern char *check_icu_winlocale(const char *winlocale);
+#endif /* WIN32 */
+#endif /* USE_ICU */
+
+#endif /* not LIBPQ_MAKE */
+
/* port/inet_net_ntop.c */
extern char *inet_net_ntop(int af, const void *src, int bits,
char *dst, size_t size);
diff --git a/src/include/port/win32.h b/src/include/port/win32.h
index 9f48a58aed..7e3e7e57e6 100644
--- a/src/include/port/win32.h
+++ b/src/include/port/win32.h
@@ -16,7 +16,7 @@
* get support for GetLocaleInfoEx() with locales. For everything else
* the minimum version is Windows XP (0x0501).
*/
-#if defined(_MSC_VER) && _MSC_VER >= 1900
+#if defined(_MSC_VER) && _MSC_VER >= 1800
#define MIN_WINNT 0x0600
#else
#define MIN_WINNT 0x0501
diff --git a/src/include/utils/pg_locale.h b/src/include/utils/pg_locale.h
index 88a3134862..161a14ef61 100644
--- a/src/include/utils/pg_locale.h
+++ b/src/include/utils/pg_locale.h
@@ -57,8 +57,10 @@ extern void assign_locale_numeric(const char *newval, void *extra);
extern bool check_locale_time(char **newval, void **extra, GucSource source);
extern void assign_locale_time(const char *newval, void *extra);
-extern bool check_locale(int category, const char *locale, char **canonname);
-extern char *pg_perm_setlocale(int category, const char *locale);
+extern bool check_locale(int category, const char *locale, char **canonname,
+ char collprovider);
+extern const char *pg_perm_setlocale(int category, const char *locale,
+ char collprovider);
extern void check_strxfrm_bug(void);
extern bool lc_collate_is_c(Oid collation);
@@ -102,11 +104,11 @@ typedef struct pg_locale_struct *pg_locale_t;
extern pg_locale_t pg_newlocale_from_collation(Oid collid);
-extern char *get_collation_actual_version(char collprovider, const char *collcollate);
-
#ifdef USE_ICU
extern int32_t icu_to_uchar(UChar **buff_uchar, const char *buff, size_t nbytes);
extern int32_t icu_from_uchar(char **result, const UChar *buff_uchar, int32_t len_uchar);
+extern const char *get_icu_default_collate(void);
+extern UCollator *get_default_collation_collator(void);
#endif
/* These functions convert from/to libc's wchar_t, *not* pg_wchar_t */
@@ -115,4 +117,6 @@ extern size_t wchar2char(char *to, const wchar_t *from, size_t tolen,
extern size_t char2wchar(wchar_t *to, size_t tolen,
const char *from, size_t fromlen, pg_locale_t locale);
+extern char get_default_collprovider(void);
+
#endif /* _PG_LOCALE_ */
diff --git a/src/interfaces/libpq/.gitignore b/src/interfaces/libpq/.gitignore
index 9be338dec8..7d2e5ce92e 100644
--- a/src/interfaces/libpq/.gitignore
+++ b/src/interfaces/libpq/.gitignore
@@ -3,3 +3,4 @@
# .c files that are symlinked in from elsewhere
/encnames.c
/wchar.c
+/pg_collation_fn_common.c
diff --git a/src/interfaces/libpq/Makefile b/src/interfaces/libpq/Makefile
index c2171d0856..21b5205bb2 100644
--- a/src/interfaces/libpq/Makefile
+++ b/src/interfaces/libpq/Makefile
@@ -19,10 +19,11 @@ NAME= pq
SO_MAJOR_VERSION= 5
SO_MINOR_VERSION= $(MAJORVERSION)
-override CPPFLAGS := -DFRONTEND -DUNSAFE_STAT_OK -I$(srcdir) $(CPPFLAGS) -I$(top_builddir)/src/port -I$(top_srcdir)/src/port
+override CPPFLAGS := -DFRONTEND -DUNSAFE_STAT_OK -I$(srcdir) $(CPPFLAGS) -I$(top_builddir)/src/port -I$(top_srcdir)/src/port -DLIBPQ_MAKE
ifneq ($(PORTNAME), win32)
override CFLAGS += $(PTHREAD_CFLAGS)
endif
+LDFLAGS_INTERNAL += $(ICU_LIBS)
# The MSVC build system scrapes OBJS from this file. If you change any of
# the conditional additions of files to OBJS, update Mkvcbuild.pm to match.
diff --git a/src/port/chklocale.c b/src/port/chklocale.c
index dde913099f..a30bded981 100644
--- a/src/port/chklocale.c
+++ b/src/port/chklocale.c
@@ -23,8 +23,26 @@
#include <langinfo.h>
#endif
+#ifdef USE_ICU
+#include <unicode/ucol.h>
+#endif
+
+#include "catalog/pg_collation.h"
+#include "common/pg_collation_fn_common.h"
#include "mb/pg_wchar.h"
+/*
+ * In backend, we will use palloc/pfree. In frontend, use malloc/free.
+ */
+#ifndef FRONTEND
+#define STRDUP(s) pstrdup(s)
+#define ALLOC(size) palloc(size)
+#define FREE(s) pfree(s)
+#else
+#define STRDUP(s) strdup(s)
+#define ALLOC(size) malloc(size)
+#define FREE(s) free(s)
+#endif
/*
* This table needs to recognize all the CODESET spellings for supported
@@ -436,3 +454,583 @@ pg_get_encoding_from_locale(const char *ctype, bool write_message)
}
#endif /* (HAVE_LANGINFO_H && CODESET) || WIN32 */
+
+/* do not build libpq with ICU support */
+#ifndef LIBPQ_MAKE
+
+/*
+ * Check whether the locale name contains a collation provider modifier.
+ *
+ * Set collprovider according to the modifier, or to '\0' if there is none.
+ * Set collversion to NULL if no version follows the collation provider
+ * modifier.
+ *
+ * If locale is not NULL, an allocated copy of its canonical name, without the
+ * collation provider modifier and the collation version, is stored in
+ * canonname. The canonname can have zero length.
+ */
+void
+check_locale_collprovider(const char *locale, char **canonname,
+ char *collprovider, char **collversion)
+{
+ const char *modifier_sign,
+ *dot_sign,
+ *cur_collprovider_end;
+ char cur_collprovider_name[NAMEDATALEN];
+ int cur_collprovider_len;
+ char cur_collprovider;
+
+ /* in case of failure or if we don't find them in the locale name */
+ if (canonname)
+ *canonname = NULL;
+ if (collprovider)
+ *collprovider = '\0';
+ if (collversion)
+ *collversion = NULL;
+
+ if (!locale)
+ return;
+
+ /* find the last occurrence of the modifier sign '@' in the locale */
+ modifier_sign = strrchr(locale, '@');
+
+ if (!modifier_sign)
+ {
+ /* no modifier: just copy the whole name */
+ if (canonname)
+ *canonname = STRDUP(locale);
+ return;
+ }
+
+ /* check if there's a version after the collation provider modifier */
+ if ((dot_sign = strchr(modifier_sign, '.')) == NULL)
+ cur_collprovider_end = &locale[strlen(locale)];
+ else
+ cur_collprovider_end = dot_sign;
+
+ cur_collprovider_len = cur_collprovider_end - modifier_sign - 1;
+ if (cur_collprovider_len + 1 > NAMEDATALEN)
+ {
+#ifdef FRONTEND
+ fprintf(stderr, _("collation provider name is too long: %s"), locale);
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else /* not FRONTEND */
+ ereport(ERROR,
+ (errmsg("collation provider name is too long: %s", locale)));
+#endif /* not FRONTEND */
+ return;
+ }
+
+ strncpy(cur_collprovider_name, modifier_sign + 1, cur_collprovider_len);
+ cur_collprovider_name[cur_collprovider_len] = '\0';
+
+ /* check if this is a valid collprovider name */
+ cur_collprovider = get_collprovider(cur_collprovider_name);
+ if (is_valid_nondefault_collprovider(cur_collprovider))
+ {
+ if (collprovider)
+ *collprovider = cur_collprovider;
+
+ if (canonname)
+ {
+ int canonname_len = modifier_sign - locale;
+
+ *canonname = ALLOC((canonname_len + 1) * sizeof(char));
+ if (*canonname)
+ {
+ strncpy(*canonname, locale, canonname_len);
+ (*canonname)[canonname_len] = '\0';
+ }
+ else
+ {
+#ifdef FRONTEND
+ fprintf(stderr, _("out of memory"));
+ /*
+ * keep newline separate so there's only one translatable string
+ */
+ fputc('\n', stderr);
+#else /* not FRONTEND */
+ ereport(ERROR, (errmsg("out of memory")));
+#endif /* not FRONTEND */
+ }
+ }
+
+ if (dot_sign && collversion)
+ *collversion = STRDUP(dot_sign + 1);
+ }
+ else
+ {
+ /* not a provider modifier: just copy the whole name */
+ if (canonname)
+ *canonname = STRDUP(locale);
+ }
+}
+
+/*
+ * Return true if locale is "C" or "POSIX".
+ */
+bool
+locale_is_c(const char *locale)
+{
+ return locale && (strcmp(locale, "C") == 0 || strcmp(locale, "POSIX") == 0);
+}
+
+/*
+ * Return the locale with the collation provider modifier and version appended.
+ *
+ * Return NULL if locale is NULL.
+ */
+char *
+get_full_collation_name(const char *locale, char collprovider,
+ const char *collversion)
+{
+ char *new_locale;
+ int old_len,
+ len_with_provider,
+ new_len;
+ const char *collprovider_name;
+
+ if (!locale)
+ return NULL;
+
+ collprovider_name = get_collprovider_name(collprovider);
+ Assert(collprovider_name);
+
+ old_len = strlen(locale);
+ new_len = len_with_provider = old_len + 1 + strlen(collprovider_name);
+ if (collversion && *collversion)
+ new_len += 1 + strlen(collversion);
+
+ new_locale = ALLOC((new_len + 1) * sizeof(char));
+ if (!new_locale)
+ {
+#ifdef FRONTEND
+ fprintf(stderr, _("out of memory"));
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else /* not FRONTEND */
+ ereport(ERROR, (errmsg("out of memory")));
+#endif /* not FRONTEND */
+
+ return NULL;
+ }
+
+ /* add the collation provider modifier */
+ strcpy(new_locale, locale);
+ new_locale[old_len] = '@';
+ strcpy(&new_locale[old_len + 1], collprovider_name);
+
+ /* add the collation version if needed */
+ if (collversion && *collversion)
+ {
+ new_locale[len_with_provider] = '.';
+ strcpy(&new_locale[len_with_provider + 1], collversion);
+ }
+
+ new_locale[new_len] = '\0';
+
+ return new_locale;
+}
+
+/*
+ * Get provider-specific collation version string for the given collation from
+ * the operating system/library.
+ *
+ * A particular provider must always either return a non-NULL string or return
+ * NULL (if it doesn't support versions). It must not return NULL for some
+ * collcollate and not NULL for others.
+ */
+#ifdef FRONTEND
+void
+get_collation_actual_version(char collprovider, const char *collcollate,
+ char **collversion, bool *failure)
+{
+ if (failure)
+ *failure = false;
+
+#ifdef USE_ICU
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+ UCollator *collator = open_collator(collcollate);
+ UVersionInfo versioninfo;
+ char buf[U_MAX_VERSION_STRING_LENGTH];
+
+ if (collator)
+ {
+ ucol_getVersion(collator, versioninfo);
+ ucol_close(collator);
+
+ u_versionToString(versioninfo, buf);
+ if (collversion)
+ *collversion = STRDUP(buf);
+ }
+ else
+ {
+ if (collversion)
+ *collversion = NULL;
+ if (failure)
+ *failure = true;
+ }
+ }
+ else
+#endif
+ {
+ if (collversion)
+ *collversion = NULL;
+ }
+}
+#else /* not FRONTEND */
+char *
+get_collation_actual_version(char collprovider, const char *collcollate)
+{
+ char *collversion;
+
+#ifdef USE_ICU
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+ UCollator *collator = open_collator(collcollate);
+ UVersionInfo versioninfo;
+ char buf[U_MAX_VERSION_STRING_LENGTH];
+
+ ucol_getVersion(collator, versioninfo);
+ ucol_close(collator);
+
+ u_versionToString(versioninfo, buf);
+ collversion = STRDUP(buf);
+ }
+ else
+#endif
+ collversion = NULL;
+
+ return collversion;
+}
+#endif /* not FRONTEND */
+
+#ifdef USE_ICU
+/*
+ * Open the collator for this ICU locale. Return NULL in case of failure.
+ */
+UCollator *
+open_collator(const char *collate)
+{
+ UCollator *collator;
+ UErrorCode status;
+ const char *save = uloc_getDefault();
+ char *save_dup;
+
+ if (!save)
+ {
+#ifdef FRONTEND
+ fprintf(stderr, _("ICU error: uloc_getDefault() failed"));
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR, (errmsg("ICU error: uloc_getDefault() failed")));
+#endif
+ return NULL;
+ }
+
+ /* save may be pointing at a modifiable scratch variable, so copy it. */
+ save_dup = STRDUP(save);
+
+ /* set the default locale to root */
+ status = U_ZERO_ERROR;
+ uloc_setDefault(ICU_ROOT_LOCALE, &status);
+ if (U_FAILURE(status))
+ {
+#ifdef FRONTEND
+ fprintf(stderr, _("ICU error: failed to set the default locale to \"%s\": %s"),
+ ICU_ROOT_LOCALE, u_errorName(status));
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR,
+ (errmsg("ICU error: failed to set the default locale to \"%s\": %s",
+ ICU_ROOT_LOCALE, u_errorName(status))));
+#endif
+ return NULL;
+ }
+
+ /* get a collator for this collate */
+ status = U_ZERO_ERROR;
+ collator = ucol_open(collate, &status);
+ if (U_FAILURE(status))
+ {
+#ifdef FRONTEND
+ fprintf(stderr, _("ICU error: could not open collator for locale \"%s\": %s"),
+ collate, u_errorName(status));
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR,
+ (errmsg("ICU error: could not open collator for locale \"%s\": %s",
+ collate, u_errorName(status))));
+#endif
+ collator = NULL;
+ }
+
+ /* restore old value of the default locale. */
+ status = U_ZERO_ERROR;
+ uloc_setDefault(save_dup, &status);
+ if (U_FAILURE(status))
+ {
+#ifdef FRONTEND
+ fprintf(stderr, _("ICU error: failed to restore old locale \"%s\": %s"),
+ save_dup, u_errorName(status));
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR,
+ (errmsg("ICU error: failed to restore old locale \"%s\": %s",
+ save_dup, u_errorName(status))));
+#endif
+ return NULL;
+ }
+ FREE(save_dup);
+
+ return collator;
+}
+
+/*
+ * Get the ICU language tag for a locale name.
+ * The result is a palloc'd string.
+ * Return NULL in case of failure or if localename is NULL.
+ */
+char *
+get_icu_language_tag(const char *localename)
+{
+ char buf[ULOC_FULLNAME_CAPACITY];
+ UErrorCode status = U_ZERO_ERROR;
+
+ if (!localename)
+ return NULL;
+
+ uloc_toLanguageTag(localename, buf, sizeof(buf), TRUE, &status);
+ if (U_FAILURE(status))
+ {
+#ifdef FRONTEND
+ fprintf(stderr,
+ _("ICU error: could not convert locale name \"%s\" to language tag: %s"),
+ localename, u_errorName(status));
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR,
+ (errmsg("ICU error: could not convert locale name \"%s\" to language tag: %s",
+ localename, u_errorName(status))));
+#endif
+ return NULL;
+ }
+ return STRDUP(buf);
+}
+
+/*
+ * Get the ICU collation name: the language tag for ICU 54+, else the locale.
+ */
+const char *
+get_icu_collate(const char *locale, const char *langtag)
+{
+ return U_ICU_VERSION_MAJOR_NUM >= 54 ? langtag : locale;
+}
+
+#ifdef WIN32
+/*
+ * Get the Language Code Identifier (LCID) for the Windows locale.
+ *
+ * Return zero in case of failure.
+ */
+static uint32
+get_lcid(const wchar_t *winlocale)
+{
+ /*
+ * The second argument to the LocaleNameToLCID function is:
+ * - Prior to Windows 7: reserved; should always be 0.
+ * - Beginning in Windows 7: use LOCALE_ALLOW_NEUTRAL_NAMES to allow the
+ * return of lcids of locales without regions.
+ */
+#if (NTDDI_VERSION >= NTDDI_WIN7)
+ return LocaleNameToLCID(winlocale, LOCALE_ALLOW_NEUTRAL_NAMES);
+#else
+ return LocaleNameToLCID(winlocale, 0);
+#endif
+}
+
+/*
+ * char2wchar_ascii --- convert multibyte characters to wide characters
+ *
+ * This is a simplified version of the backend's char2wchar() function.
+ */
+static size_t
+char2wchar_ascii(wchar_t *to, size_t tolen, const char *from, size_t fromlen)
+{
+ size_t result;
+
+ if (tolen == 0)
+ return 0;
+
+ /* Win32 API does not work for zero-length input */
+ if (fromlen == 0)
+ result = 0;
+ else
+ {
+ result = MultiByteToWideChar(CP_ACP, 0, from, fromlen, to, tolen - 1);
+ /* A zero return is failure */
+ if (result == 0)
+ result = -1;
+ }
+
+ if (result != -1)
+ {
+ Assert(result < tolen);
+ /* Append trailing null wchar (MultiByteToWideChar() does not) */
+ to[result] = 0;
+ }
+
+ return result;
+}
+
+/*
+ * Get the canonical ICU name for the Windows locale.
+ *
+ * Return a malloc'd string or NULL in case of failure.
+ */
+char *
+check_icu_winlocale(const char *winlocale)
+{
+ uint32 lcid;
+ char canonname_buf[ULOC_FULLNAME_CAPACITY];
+ UErrorCode status = U_ZERO_ERROR;
+#if (_MSC_VER >= 1400) /* VC8.0 or later */
+ _locale_t loct = NULL;
+#endif
+
+ if (winlocale == NULL)
+ return NULL;
+
+ /* Get the Language Code Identifier (LCID). */
+
+#if (_MSC_VER >= 1400) /* VC8.0 or later */
+ loct = _create_locale(LC_COLLATE, winlocale);
+
+ if (loct != NULL)
+ {
+#if (_MSC_VER >= 1700) /* Visual Studio 2012 or later */
+ if ((lcid = get_lcid(loct->locinfo->locale_name[LC_COLLATE])) == 0)
+ {
+ /* there's an error */
+#ifdef FRONTEND
+ fprintf(stderr,
+ _("failed to get the Language Code Identifier (LCID) for locale \"%s\""),
+ winlocale);
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else /* not FRONTEND */
+ ereport(ERROR,
+ (errmsg("failed to get the Language Code Identifier (LCID) for locale \"%s\"",
+ winlocale)));
+#endif /* not FRONTEND */
+ _free_locale(loct);
+ return NULL;
+ }
+#else /* _MSC_VER >= 1400 && _MSC_VER < 1700 */
+ if ((lcid = loct->locinfo->lc_handle[LC_COLLATE]) == 0)
+ {
+ /* there's an error */
+#ifdef FRONTEND
+ fprintf(stderr,
+ _("failed to get the Language Code Identifier (LCID) for locale \"%s\""),
+ winlocale);
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else /* not FRONTEND */
+ ereport(ERROR,
+ (errmsg("failed to get the Language Code Identifier (LCID) for locale \"%s\"",
+ winlocale)));
+#endif /* not FRONTEND */
+ _free_locale(loct);
+ return NULL;
+ }
+#endif /* _MSC_VER >= 1400 && _MSC_VER < 1700 */
+ _free_locale(loct);
+ }
+ else
+#endif /* VC8.0 or later */
+ {
+ if (strlen(winlocale) == 0)
+ {
+ lcid = LOCALE_USER_DEFAULT;
+ }
+ else
+ {
+ size_t locale_len = strlen(winlocale);
+ wchar_t *wlocale = (wchar_t*) ALLOC(
+ (locale_len + 1) * sizeof(wchar_t));
+ /* Locale names use only ASCII */
+ size_t locale_wlen = char2wchar_ascii(wlocale, locale_len + 1,
+ winlocale, locale_len);
+ if (locale_wlen == -1)
+ {
+ /* there's an error */
+#ifdef FRONTEND
+ fprintf(stderr,
+ _("failed to convert locale \"%s\" to wide characters"),
+ winlocale);
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR,
+ (errmsg("failed to convert locale \"%s\" to wide characters",
+ winlocale)));
+#endif
+ FREE(wlocale);
+ return NULL;
+ }
+
+ if ((lcid = get_lcid(wlocale)) == 0)
+ {
+ /* there's an error */
+#ifdef FRONTEND
+ fprintf(stderr,
+ _("failed to get the Language Code Identifier (LCID) for locale \"%s\""),
+ winlocale);
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR,
+ (errmsg("failed to get the Language Code Identifier (LCID) for locale \"%s\"",
+ winlocale)));
+#endif
+ FREE(wlocale);
+ return NULL;
+ }
+
+ FREE(wlocale);
+ }
+ }
+
+ /* Get the ICU canonical name. */
+
+ uloc_getLocaleForLCID(lcid, canonname_buf, sizeof(canonname_buf), &status);
+ if (U_FAILURE(status))
+ {
+#ifdef FRONTEND
+ fprintf(stderr,
+ _("ICU error: failed to get the locale name for LCID 0x%04x: %s"),
+ lcid, u_errorName(status));
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR,
+ (errmsg("ICU error: failed to get the locale name for LCID 0x%04x: %s",
+ lcid, u_errorName(status))));
+#endif
+ return NULL;
+ }
+
+ return STRDUP(canonname_buf);
+}
+#endif /* WIN32 */
+#endif /* USE_ICU */
+
+#endif /* not LIBPQ_MAKE */
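
For readers following the name format, here is a minimal standalone sketch of
the "locale@provider.version" split that check_locale_collprovider() performs.
It is not the patch's code: the split_locale struct, split_locale_name() and
is_known_provider() are hypothetical names made up for illustration, fixed-size
buffers replace the ALLOC/STRDUP calls, and the provider names "icu" and "libc"
are an assumption taken from the test expectations rather than from
get_collprovider(), which is not shown here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result type for this sketch; it is not part of the patch. */
typedef struct
{
	char		name[64];		/* locale name without provider modifier */
	char		provider[16];	/* recognized provider name, or "" */
	char		version[32];	/* text after the '.', or "" */
} split_locale;

/* Assumption: the recognized provider names are "icu" and "libc". */
static int
is_known_provider(const char *s)
{
	return strcmp(s, "icu") == 0 || strcmp(s, "libc") == 0;
}

static void
split_locale_name(const char *locale, split_locale *out)
{
	const char *at;

	assert(locale != NULL && out != NULL);
	memset(out, 0, sizeof(*out));

	/* only the last '@' can introduce the provider modifier */
	at = strrchr(locale, '@');
	if (at != NULL)
	{
		const char *dot = strchr(at, '.');
		const char *end = dot ? dot : locale + strlen(locale);
		size_t		plen = (size_t) (end - at - 1);

		/* the buffer is zero-filled, so the copy stays terminated */
		if (plen < sizeof(out->provider))
			memcpy(out->provider, at + 1, plen);

		if (is_known_provider(out->provider))
		{
			snprintf(out->name, sizeof(out->name), "%.*s",
					 (int) (at - locale), locale);
			if (dot != NULL)
				snprintf(out->version, sizeof(out->version), "%s", dot + 1);
			return;
		}
		out->provider[0] = '\0';
	}

	/* no recognized modifier: keep the whole name */
	snprintf(out->name, sizeof(out->name), "%s", locale);
}
```

With this sketch, "be_BY@latin@icu" splits into the name "be_BY@latin" and the
provider "icu", while "be_BY@icu@latin" is kept whole because "latin" is not a
provider name. That is consistent with the tests below, which expect the
second form to be rejected as an invalid locale name.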
diff --git a/src/test/Makefile b/src/test/Makefile
index efb206aa75..d74c6150ef 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,7 @@ subdir = src/test
top_builddir = ../..
include $(top_builddir)/src/Makefile.global
-SUBDIRS = perl regress isolation modules authentication recovery subscription
+SUBDIRS = perl regress isolation modules authentication recovery subscription default_collation
# Test suites that are not safe by default but can be run if selected
# by the user via the whitespace-separated list in variable
diff --git a/src/test/default_collation/Makefile b/src/test/default_collation/Makefile
new file mode 100644
index 0000000000..2efe8becb7
--- /dev/null
+++ b/src/test/default_collation/Makefile
@@ -0,0 +1,28 @@
+# src/test/default_collation/Makefile
+
+subdir = src/test/default_collation
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+ifeq ($(with_icu),yes)
+check:
+ $(MAKE) -C icu check
+check-utf8:
+ $(MAKE) -C icu.utf8 check
+ $(MAKE) -C libc.utf8 check
+else
+check:
+ $(MAKE) -C libc check
+check-utf8:
+ $(MAKE) -C libc.utf8 check
+endif
+
+# We don't check libc/ if with_icu or vice versa, but we do want "make clean" to
+# recurse into it. The same goes for libc.utf8/ or icu.utf8/, which we don't
+# check by default.
+ALWAYS_SUBDIRS = libc libc.utf8 icu icu.utf8
+
+clean distclean maintainer-clean:
+ for d in $(ALWAYS_SUBDIRS); do \
+ $(MAKE) -C $$d clean || exit; \
+ done
diff --git a/src/test/default_collation/icu.utf8/.gitignore b/src/test/default_collation/icu.utf8/.gitignore
new file mode 100644
index 0000000000..871e943d50
--- /dev/null
+++ b/src/test/default_collation/icu.utf8/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/default_collation/icu.utf8/Makefile b/src/test/default_collation/icu.utf8/Makefile
new file mode 100644
index 0000000000..7adecfd240
--- /dev/null
+++ b/src/test/default_collation/icu.utf8/Makefile
@@ -0,0 +1,11 @@
+# src/test/default_collation/icu.utf8/Makefile
+
+subdir = src/test/default_collation/icu.utf8
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+ $(prove_check)
+
+clean distclean maintainer-clean:
+ rm -rf tmp_check
diff --git a/src/test/default_collation/icu.utf8/t/001_default_collation.pl b/src/test/default_collation/icu.utf8/t/001_default_collation.pl
new file mode 100644
index 0000000000..617c06d2d7
--- /dev/null
+++ b/src/test/default_collation/icu.utf8/t/001_default_collation.pl
@@ -0,0 +1,799 @@
+use strict;
+use warnings;
+
+use Config;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 188;
+
+my $tempdir = TestLib::tempdir;
+my $datadir = "$tempdir/data";
+
+# test initdb
+
+sub test_initdb
+{
+ my ($test_name, $options, $expected_collprovider, $error_message) = @_;
+ my ($in_initdb, $out_initdb, $err_initdb);
+
+ mkdir $datadir;
+
+ my @command = (qw(initdb -A trust -N -D), $datadir, split(" ", $options));
+ print "# Running: " . join(" ", @command) . "\n";
+ my $result = IPC::Run::run \@command, \$in_initdb, \$out_initdb,
+ \$err_initdb;
+
+ if ($error_message)
+ {
+ like($err_initdb,
+ qr{$error_message},
+ "initdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_initdb, "", "\"@command\" no stderr");
+ like($out_initdb,
+ qr{The default collation provider is \"$expected_collprovider\"\.},
+ "initdb: $test_name: check output");
+ }
+
+ File::Path::rmtree $datadir;
+}
+
+sub psql
+{
+ my ($command, $db) = @_;
+ my ($result, $in, $out, $err);
+ my @psql = ('psql', '-X', '-c', $command);
+ if (defined($db))
+ {
+ push(@psql, $db);
+ }
+ print "# Running: " . join(" ", @psql) . "\n";
+ $result = IPC::Run::run \@psql, \$in, \$out, \$err;
+ ($result, $out, $err);
+}
+
+# --locale
+
+test_initdb(
+ "en_US.utf8 locale",
+ "--locale=en_US.utf8",
+ "icu",
+ "");
+
+test_initdb(
+ "en_US.utf8 locale with C ctype",
+ "--locale=en_US.utf8 --lc-ctype=C",
+ "",
+ "collations with different collate and ctype values are not supported by "
+ . "ICU");
+
+test_initdb(
+ "be_BY\@latin icu locale",
+ "--locale=be_BY\@latin\@icu",
+ "icu",
+ "");
+
+test_initdb(
+ "be_BY\@latin icu locale invalid modifier order",
+ "--locale=be_BY\@icu\@latin",
+ "",
+ "invalid locale name \"be_BY\@icu\@latin\"");
+
+# --lc-collate with the same --lc-ctype if needed
+
+test_initdb(
+ "en_US.utf8 lc_collate",
+ "--lc-collate=en_US.utf8 --lc-ctype=en_US.utf8",
+ "icu",
+ "");
+
+test_initdb(
+ "en_US.utf8 lc_collate with C ctype",
+ "--lc-collate=en_US.utf8 --lc-ctype=C",
+ "",
+ "collations with different collate and ctype values are not supported by "
+ . "ICU");
+
+test_initdb(
+ "be_BY\@latin icu lc_collate",
+ "--lc-collate=be_BY\@latin\@icu --lc-ctype=be_BY\@latin",
+ "icu",
+ "");
+
+test_initdb(
+ "be_BY\@latin icu lc_collate invalid modifier order",
+ "--lc-collate=be_BY\@icu\@latin",
+ "",
+ "invalid locale name \"be_BY\@icu\@latin\" \\(provider \"libc\"\\)");
+
+# test createdb and CREATE DATABASE
+
+sub test_createdb
+{
+ my ($test_name, $options, $expected_collprovider, $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ @command = ("createdb",
+ split(" ", $options),
+ "--template=template0",
+ "mydb");
+
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "createdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+
+ ($result, $out_command, $err_command) = psql(
+ "select datcollate from pg_database where datname = 'mydb';");
+
+ if ($expected_collprovider eq "libc")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider\n},
+ "createdb: $test_name: check pg_database.datcollate");
+ }
+ elsif ($expected_collprovider eq "icu")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider([\.\d]+)?\n},
+ "createdb: $test_name: check pg_database.datcollate");
+ }
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+sub test_create_database
+{
+ my ($test_name,
+ $options,
+ $expected_collprovider,
+ $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ ($result, $out_command, $err_command) = psql(
+ "create database mydb "
+ . $options
+ . " template = template0;");
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "CREATE DATABASE: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "CREATE DATABASE: $test_name: exit code 0");
+ is($err_command, "", "CREATE DATABASE: $test_name: no stderr");
+ like($out_command, qr{CREATE DATABASE}, "CREATE DATABASE: $test_name: check output");
+
+ ($result, $out_command, $err_command) = psql(
+ "select datcollate from pg_database where datname = 'mydb';");
+
+ if ($expected_collprovider eq "libc")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider\n},
+ "CREATE DATABASE: $test_name: check pg_database.datcollate");
+ }
+ elsif ($expected_collprovider eq "icu")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider([\.\d]+)?\n},
+ "CREATE DATABASE: $test_name: check pg_database.datcollate");
+ }
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+sub test_default_collation
+{
+ my ($createdb_options, $collation, $expected_collprovider, @commands) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ @command = ("createdb", split(" ", $createdb_options), "mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+
+ ($result, $out_command, $err_command) = psql(
+ "select datcollate from pg_database where datname = 'mydb';");
+
+ if ($expected_collprovider eq "libc")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider\n},
+ "\"@command\" check output");
+ }
+ elsif ($expected_collprovider eq "icu")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider([\.\d]+)?\n},
+ "\"@command\" check output");
+ }
+
+ for (my $row = 0; $row <= $#commands; $row++)
+ {
+ my ($command_text, $expected) = @{$commands[$row]};
+ ($result, $out_command, $err_command) = psql($command_text, "mydb");
+
+ ok($result, "\"$command_text\" exit code 0");
+ is($err_command, "", "\"$command_text\" no stderr");
+ if ($out_command)
+ {
+ is(
+ $out_command,
+ $expected,
+ "default collation "
+ . $collation
+ . ": \""
+ . $command_text
+ . "\" check output");
+ }
+ }
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+}
+
+my $node = get_new_node('main');
+$node->init;
+$node->start;
+local $ENV{PGPORT} = $node->port;
+
+my @command = ("createuser --createdb --no-superuser non_superuser");
+print "# Running: " . join(" ", @command) . "\n";
+system(@command);
+
+# test createdb
+
+# --locale
+
+test_createdb(
+ "en_US.utf8 locale",
+ "--locale=en_US.utf8",
+ "icu",
+ "");
+
+test_createdb(
+ "be_BY\@latin icu locale",
+ "--locale=be_BY\@latin\@icu",
+ "icu",
+ "");
+
+test_createdb(
+ "be_BY\@latin icu locale invalid modifier order",
+ "--locale=be_BY\@icu\@latin",
+ "",
+ "invalid locale name: \"be_BY\@icu\@latin\"");
+
+# --lc-collate with the same --lc-ctype if needed
+
+test_createdb(
+ "en_US.utf8 lc_collate",
+ "--lc-collate=en_US.utf8 --lc-ctype=en_US.utf8",
+ "icu",
+ "");
+
+test_createdb(
+ "en_US.utf8 lc_collate with C ctype",
+ "--lc-collate=en_US.utf8 --lc-ctype=C",
+ "",
+ "collations with different collate and ctype values are not supported by "
+ . "ICU");
+
+test_createdb(
+ "be_BY\@latin icu lc_collate",
+ "--lc-collate=be_BY\@latin\@icu --lc-ctype=be_BY\@latin",
+ "icu",
+ "");
+
+test_createdb(
+ "be_BY\@latin icu lc_collate invalid modifier order",
+ "--lc-collate=be_BY\@icu\@latin",
+ "",
+ "invalid locale name: \"be_BY\@icu\@latin\" \\(provider \"libc\"\\)");
+
+# test CREATE DATABASE
+
+test_create_database(
+ "en_US.utf8 lc_collate",
+ "LC_COLLATE = 'en_US.utf8' LC_CTYPE = 'en_US.utf8'",
+ "icu",
+ "");
+
+test_create_database(
+ "en_US.utf8 lc_collate with C ctype",
+ "LC_COLLATE = 'en_US.utf8' LC_CTYPE = 'C'",
+ "",
+ "collations with different collate and ctype values are not supported by "
+ . "ICU");
+
+test_create_database(
+ "be_BY\@latin icu lc_collate",
+ "LC_COLLATE = 'be_BY\@latin' LC_CTYPE = 'be_BY\@latin'",
+ "icu",
+ "");
+
+test_create_database(
+ "be_BY\@latin icu lc_collate invalid modifier order",
+ "LC_COLLATE = 'be_BY\@icu\@latin'",
+ "",
+ "invalid locale name: \"be_BY\@icu\@latin\" \\(provider \"libc\"\\)");
+
+# test default collation behaviour
+# use commands and outputs from the regression test collate.icu.utf8
+
+test_default_collation(
+ "--lc-collate=en_US.utf8 --lc-ctype=en_US.utf8 --template=template0",
+ "en_US.utf8\@icu",
+ "icu",
+ (
+ [
+ "CREATE TABLE collate_test1 (a int, b text NOT NULL);",
+ "CREATE TABLE\n"
+ ],
+ [
+ "INSERT INTO collate_test1 VALUES "
+ . "(1, 'abc'), (2, 'äbc'), (3, 'bbc'), (4, 'ABC');",
+ "INSERT 0 4\n"],
+ [
+ "SELECT * FROM collate_test1 WHERE b >= 'bbc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 3 | bbc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, b FROM collate_test1 ORDER BY b;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # star expansion
+ [
+ "SELECT * FROM collate_test1 ORDER BY b;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # upper/lower
+ ["CREATE TABLE collate_test10 (a int, x text);", "CREATE TABLE\n", ""],
+ [
+ "INSERT INTO collate_test10 VALUES (1, 'hij'), (2, 'HIJ');",
+ "INSERT 0 2\n"
+ ],
+ [
+ "SELECT a, lower(x), upper(x), initcap(x) FROM collate_test10;",
+ " a | lower | upper | initcap \n"
+ . "---+-------+-------+---------\n"
+ . " 1 | hij | HIJ | Hij\n"
+ . " 2 | hij | HIJ | Hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ # LIKE/ILIKE
+ [
+ "SELECT * FROM collate_test1 WHERE b LIKE 'abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b LIKE 'abc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b LIKE '%bc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(3 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ILIKE 'abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ILIKE 'abc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ILIKE '%bc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 4 | ABC\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'Türkiye' ILIKE '%KI%' AS \"true\";",
+ " true \n"
+ . "------\n"
+ . " t\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'bıt' ILIKE 'BIT' AS \"false\";",
+ " false \n"
+ . "-------\n"
+ . " f\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ # regular expressions
+ [
+ "SELECT * FROM collate_test1 WHERE b ~ '^abc\$';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~ '^abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~ 'bc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(3 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~* '^abc\$';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~* '^abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~* 'bc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 4 | ABC\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ ["CREATE TABLE collate_test6 (a int, b text);", "CREATE TABLE\n", ""],
+ [
+ "INSERT INTO collate_test6 VALUES "
+ . "(1, 'abc'), (2, 'ABC'), (3, '123'), (4, 'ab1'), "
+ . "(5, 'a1!'), (6, 'a c'), (7, '!.;'), (8, ' '), "
+ . "(9, 'äbç'), (10, 'ÄBÇ');",
+ "INSERT 0 10\n"
+ ],
+ [
+ "SELECT b, "
+ . "b ~ '^[[:alpha:]]+\$' AS is_alpha, "
+ . "b ~ '^[[:upper:]]+\$' AS is_upper, "
+ . "b ~ '^[[:lower:]]+\$' AS is_lower, "
+ . "b ~ '^[[:digit:]]+\$' AS is_digit, "
+ . "b ~ '^[[:alnum:]]+\$' AS is_alnum, "
+ . "b ~ '^[[:graph:]]+\$' AS is_graph, "
+ . "b ~ '^[[:print:]]+\$' AS is_print, "
+ . "b ~ '^[[:punct:]]+\$' AS is_punct, "
+ . "b ~ '^[[:space:]]+\$' AS is_space "
+ . "FROM collate_test6;",
+ " b | is_alpha | is_upper | is_lower | is_digit | is_alnum | is_graph | is_print | is_punct | is_space \n"
+ . "-----+----------+----------+----------+----------+----------+----------+----------+----------+----------\n"
+ . " abc | t | f | t | f | t | t | t | f | f\n"
+ . " ABC | t | t | f | f | t | t | t | f | f\n"
+ . " 123 | f | f | f | t | t | t | t | f | f\n"
+ . " ab1 | f | f | f | f | t | t | t | f | f\n"
+ . " a1! | f | f | f | f | f | t | t | f | f\n"
+ . " a c | f | f | f | f | f | f | t | f | f\n"
+ . " !.; | f | f | f | f | f | t | t | t | f\n"
+ . " | f | f | f | f | f | f | t | f | t\n"
+ . " äbç | t | f | t | f | t | t | t | f | f\n"
+ . " ÄBÇ | t | t | f | f | t | t | t | f | f\n"
+ . "(10 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'Türkiye' ~* 'KI' AS \"true\";",
+ " true \n"
+ . "------\n"
+ . " t\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'bıt' ~* 'BIT' AS \"false\";",
+ " false \n"
+ . "-------\n"
+ . " f\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, lower(coalesce(x, 'foo')) FROM collate_test10;",
+ " a | lower \n"
+ . "---+-------\n"
+ . " 1 | hij\n"
+ . " 2 | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, b, greatest(b, 'CCC') FROM collate_test1 ORDER BY 3;",
+ " a | b | greatest \n"
+ . "---+-----+----------\n"
+ . " 1 | abc | CCC\n"
+ . " 2 | äbc | CCC\n"
+ . " 3 | bbc | CCC\n"
+ . " 4 | ABC | CCC\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, x, lower(greatest(x, 'foo')) FROM collate_test10;",
+ " a | x | lower \n"
+ . "---+-----+-------\n"
+ . " 1 | hij | hij\n"
+ . " 2 | HIJ | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, nullif(b, 'abc') FROM collate_test1 ORDER BY 2;",
+ " a | nullif \n"
+ . "---+--------\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 1 | \n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, lower(nullif(x, 'foo')) FROM collate_test10;",
+ " a | lower \n"
+ . "---+-------\n"
+ . " 1 | hij\n"
+ . " 2 | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, CASE b WHEN 'abc' THEN 'abcd' ELSE b END "
+ . "FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+------\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 1 | abcd\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ ["CREATE DOMAIN testdomain AS text;", "CREATE DOMAIN\n", ""],
+ [
+ "SELECT a, b::testdomain FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, lower(x::testdomain) FROM collate_test10;",
+ " a | lower \n"
+ . "---+-------\n"
+ . " 1 | hij\n"
+ . " 2 | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT min(b), max(b) FROM collate_test1;",
+ " min | max \n"
+ . "-----+-----\n"
+ . " abc | bbc\n"
+ . "(1 row)\n"
+ . "\n",
+ ""
+ ],
+ [
+ "SELECT array_agg(b ORDER BY b) FROM collate_test1;",
+ " array_agg \n"
+ . "-------------------\n"
+ . " {abc,ABC,äbc,bbc}\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, b FROM collate_test1 "
+ . "UNION ALL "
+ . "SELECT a, b FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 3 | bbc\n"
+ . "(8 rows)\n"
+ . "\n"
+ ],
+ # casting
+ [
+ "SELECT a, CAST(b AS varchar) FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # propagation of collation in SQL functions (inlined and non-inlined
+ # cases) and plpgsql functions too
+ [
+ "CREATE FUNCTION mylt (text, text) RETURNS boolean LANGUAGE sql "
+ . "AS \$\$ select \$1 < \$2 \$\$;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "CREATE FUNCTION mylt_noninline (text, text) "
+ . "RETURNS boolean LANGUAGE sql "
+ . "AS \$\$ select \$1 < \$2 limit 1 \$\$;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "CREATE FUNCTION mylt_plpgsql (text, text) "
+ . "RETURNS boolean LANGUAGE plpgsql "
+ . "AS \$\$ begin return \$1 < \$2; end \$\$;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "SELECT a.b AS a, b.b AS b, a.b < b.b AS lt, "
+ . "mylt(a.b, b.b), mylt_noninline(a.b, b.b), mylt_plpgsql(a.b, b.b) "
+ . "FROM collate_test1 a, collate_test1 b "
+ . "ORDER BY a.b, b.b;",
+ " a | b | lt | mylt | mylt_noninline | mylt_plpgsql \n"
+ . "-----+-----+----+------+----------------+--------------\n"
+ . " abc | abc | f | f | f | f\n"
+ . " abc | ABC | t | t | t | t\n"
+ . " abc | äbc | t | t | t | t\n"
+ . " abc | bbc | t | t | t | t\n"
+ . " ABC | abc | f | f | f | f\n"
+ . " ABC | ABC | f | f | f | f\n"
+ . " ABC | äbc | t | t | t | t\n"
+ . " ABC | bbc | t | t | t | t\n"
+ . " äbc | abc | f | f | f | f\n"
+ . " äbc | ABC | f | f | f | f\n"
+ . " äbc | äbc | f | f | f | f\n"
+ . " äbc | bbc | t | t | t | t\n"
+ . " bbc | abc | f | f | f | f\n"
+ . " bbc | ABC | f | f | f | f\n"
+ . " bbc | äbc | f | f | f | f\n"
+ . " bbc | bbc | f | f | f | f\n"
+ . "(16 rows)\n"
+ . "\n"
+ ],
+ # polymorphism
+ [
+ "SELECT * FROM unnest("
+ . "(SELECT array_agg(b ORDER BY b) FROM collate_test1)"
+ . ") ORDER BY 1;",
+ " unnest \n"
+ . "--------\n"
+ . " abc\n"
+ . " ABC\n"
+ . " äbc\n"
+ . " bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "CREATE FUNCTION dup (anyelement) RETURNS anyelement "
+ . "AS 'select \$1' LANGUAGE sql;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "SELECT a, dup(b) FROM collate_test1 ORDER BY 2;",
+ " a | dup \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # indexes
+ [
+ "CREATE INDEX collate_test1_idx1 ON collate_test1 (b);",
+ "CREATE INDEX\n"
+ ]
+ )
+);
+
+$node->stop;
diff --git a/src/test/default_collation/icu/.gitignore b/src/test/default_collation/icu/.gitignore
new file mode 100644
index 0000000000..871e943d50
--- /dev/null
+++ b/src/test/default_collation/icu/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/default_collation/icu/Makefile b/src/test/default_collation/icu/Makefile
new file mode 100644
index 0000000000..5ee91d8eaf
--- /dev/null
+++ b/src/test/default_collation/icu/Makefile
@@ -0,0 +1,11 @@
+# src/test/default_collation/icu/Makefile
+
+subdir = src/test/default_collation/icu
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+ $(prove_check)
+
+clean distclean maintainer-clean:
+ rm -rf tmp_check
diff --git a/src/test/default_collation/icu/t/001_default_collation.pl b/src/test/default_collation/icu/t/001_default_collation.pl
new file mode 100644
index 0000000000..8b58be3fa5
--- /dev/null
+++ b/src/test/default_collation/icu/t/001_default_collation.pl
@@ -0,0 +1,605 @@
+use strict;
+use warnings;
+
+use Config;
+use PostgresNode;
+use TestLib;
+use Test::More;
+
+my $tempdir = TestLib::tempdir;
+my $datadir = "$tempdir/data";
+
+# check whether ICU can convert C locale to a language tag
+
+my ($in_initdb, $out_initdb, $err_initdb);
+my @command = (qw(initdb -A trust -N -D), $datadir, "--locale=C\@icu");
+print "# Running: " . join(" ", @command) . "\n";
+my $result = IPC::Run::run \@command, \$in_initdb, \$out_initdb, \$err_initdb;
+
+my $c_to_icu_language_tag = (
+ not $err_initdb =~ /ICU error: could not convert locale name "C" to language tag: U_ILLEGAL_ARGUMENT_ERROR/);
+
+# get the number of tests
+
+plan tests => $c_to_icu_language_tag ? 124 : 110;
+
+# test initdb
+
+sub test_initdb
+{
+ my ($test_name, $options, $expected_collprovider, $error_message) = @_;
+ my ($in_initdb, $out_initdb, $err_initdb);
+
+ mkdir $datadir;
+
+ my @command = (qw(initdb -A trust -N -D), $datadir, split(" ", $options));
+ print "# Running: " . join(" ", @command) . "\n";
+ my $result = IPC::Run::run \@command, \$in_initdb, \$out_initdb,
+ \$err_initdb;
+
+ if ($error_message)
+ {
+ like($err_initdb,
+ qr{$error_message},
+ "initdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_initdb, "", "\"@command\" no stderr");
+ like($out_initdb,
+ qr{The default collation provider is \"$expected_collprovider\"\.},
+ "initdb: $test_name: check output");
+ }
+
+ File::Path::rmtree $datadir;
+}
+
+# --locale
+
+test_initdb(
+ "empty libc locale",
+ "--locale=\@libc",
+ "libc",
+ "");
+
+test_initdb(
+ "C locale without collation provider",
+ "--locale=C",
+ "libc",
+ "");
+
+test_initdb(
+ "POSIX locale without collation provider",
+ "--locale=POSIX",
+ "libc",
+ "");
+
+test_initdb(
+ "C libc locale",
+ "--locale=C\@libc",
+ "libc",
+ "");
+
+test_initdb(
+ "POSIX libc locale",
+ "--locale=POSIX\@libc",
+ "libc",
+ "");
+
+test_initdb(
+ "C icu locale",
+ "--locale=C\@icu",
+ "",
+ ($c_to_icu_language_tag ?
+ "selected encoding \\(SQL_ASCII\\) is not supported for ICU locales" :
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR"));
+
+test_initdb(
+ "POSIX icu locale",
+ "--locale=POSIX\@icu",
+ "",
+ ($c_to_icu_language_tag ?
+ "selected encoding \\(SQL_ASCII\\) is not supported for ICU locales" :
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR"));
+
+test_initdb(
+ "C locale too many modifiers",
+ "--locale=C\@icu\@libc",
+ "",
+ "invalid locale name \"C\@icu\"");
+
+test_initdb(
+ "ICU language tag format locale",
+ "--locale=und-x-icu",
+ "",
+ "invalid locale name \"und-x-icu\"");
+
+# --lc-collate with the same --lc-ctype if needed
+
+test_initdb(
+ "empty libc lc_collate",
+ "--lc-collate=\@libc",
+ "libc",
+ "");
+
+test_initdb(
+ "C lc_collate without collation provider",
+ "--lc-collate=C",
+ "libc",
+ "");
+
+test_initdb(
+ "POSIX lc_collate without collation provider",
+ "--lc-collate=POSIX",
+ "libc",
+ "");
+
+test_initdb(
+ "C libc lc_collate",
+ "--lc-collate=C\@libc",
+ "libc",
+ "");
+
+test_initdb(
+ "POSIX libc lc_collate",
+ "--lc-collate=POSIX\@libc",
+ "libc",
+ "");
+
+test_initdb(
+ "C icu lc_collate",
+ "--lc-collate=C\@icu --lc-ctype=C",
+ "",
+ ($c_to_icu_language_tag ?
+ "selected encoding \\(SQL_ASCII\\) is not supported for ICU locales" :
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR"));
+
+test_initdb(
+ "POSIX icu lc_collate",
+ "--lc-collate=POSIX\@icu --lc-ctype=POSIX",
+ "",
+ ($c_to_icu_language_tag ?
+ "selected encoding \\(SQL_ASCII\\) is not supported for ICU locales" :
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR"));
+
+test_initdb(
+ "C lc_collate too many modifiers",
+ "--lc-collate=C\@icu\@libc",
+ "",
+ "invalid locale name \"C\@icu\"");
+
+test_initdb(
+ "ICU language tag format lc_collate",
+ "--lc-collate=und-x-icu",
+ "",
+ "invalid locale name \"und-x-icu\"");
+
+# --locale & --lc-collate
+
+test_initdb(
+ "lc_collate implicit provider takes precedence",
+ "--locale=\@icu --lc-collate=C",
+ "libc",
+ "");
+
+test_initdb(
+ "lc_collate explicit provider takes precedence",
+ "--locale=C\@libc --lc-collate=C\@icu",
+ "",
+ ($c_to_icu_language_tag ?
+ "selected encoding \\(SQL_ASCII\\) is not supported for ICU locales" :
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR"));
+
+# test createdb and CREATE DATABASE
+
+sub test_createdb
+{
+ my ($test_name, $options, $expected_collprovider, $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ @command = ("createdb",
+ split(" ", $options),
+ "--template=template0",
+ "mydb");
+
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "createdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+
+ @command = (
+ "psql",
+ "-c",
+ "select datcollate from pg_database where datname = 'mydb';");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($expected_collprovider eq "libc")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider\n},
+ "createdb: $test_name: check pg_database.datcollate");
+ }
+ elsif ($expected_collprovider eq "icu")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider([\.\d]+)?\n},
+ "createdb: $test_name: check pg_database.datcollate");
+ }
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+sub test_create_database
+{
+ my ($test_name,
+ $createdb_options,
+ $psql_options,
+ $expected_collprovider,
+ $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ @command = ("psql",
+ split(" ", $psql_options),
+ "-c",
+ "create database mydb "
+ . $createdb_options
+ . " template = template0;");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "CREATE DATABASE: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+ like($out_command, qr{CREATE DATABASE}, "\"@command\" check output");
+
+ @command = (
+ "psql",
+ "-c",
+ "select datcollate from pg_database where datname = 'mydb';");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($expected_collprovider eq "libc")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider\n},
+ "CREATE DATABASE: $test_name: check pg_database.datcollate");
+ }
+ elsif ($expected_collprovider eq "icu")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider([\.\d]+)?\n},
+ "CREATE DATABASE: $test_name: check pg_database.datcollate");
+ }
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+my $node = get_new_node('main');
+$node->init;
+$node->start;
+local $ENV{PGPORT} = $node->port;
+
+@command = ("createuser --createdb --no-superuser non_superuser");
+print "# Running: " . join(" ", @command) . "\n";
+system(@command);
+
+# test createdb
+
+# --locale
+
+test_createdb(
+ "empty libc locale",
+ "--locale=\@libc",
+ "libc",
+ "");
+
+test_createdb(
+ "C locale without collation provider",
+ "--locale=C",
+ "libc",
+ "");
+
+test_createdb(
+ "POSIX locale without collation provider",
+ "--locale=POSIX",
+ "libc",
+ "");
+
+test_createdb(
+ "C libc locale",
+ "--locale=C\@libc",
+ "libc",
+ "");
+
+test_createdb(
+ "POSIX libc locale",
+ "--locale=POSIX\@libc",
+ "libc",
+ "");
+
+if ($c_to_icu_language_tag)
+{
+ test_createdb(
+ "C icu locale with SQL_ASCII encoding and superuser",
+ "--locale=C\@icu --encoding=SQL_ASCII",
+ "icu",
+ "");
+}
+else
+{
+ test_createdb(
+ "C icu locale with SQL_ASCII encoding and superuser",
+ "--locale=C\@icu --encoding=SQL_ASCII",
+ "",
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR");
+}
+
+test_createdb(
+ "C icu locale with SQL_ASCII encoding and non-superuser",
+ "--locale=C\@icu --encoding=SQL_ASCII --username=non_superuser",
+ "",
+ "encoding \"SQL_ASCII\" is not supported for ICU locales");
+
+if ($c_to_icu_language_tag)
+{
+ test_createdb(
+ "POSIX icu locale with SQL_ASCII encoding and superuser",
+ "--locale=POSIX\@icu --encoding=SQL_ASCII",
+ "icu",
+ "");
+}
+else
+{
+ test_createdb(
+ "POSIX icu locale with SQL_ASCII encoding and superuser",
+ "--locale=POSIX\@icu --encoding=SQL_ASCII",
+ "",
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR");
+}
+
+test_createdb(
+ "POSIX icu locale with SQL_ASCII encoding and non-superuser",
+ "--locale=POSIX\@icu --encoding=SQL_ASCII --username=non_superuser",
+ "",
+ "encoding \"SQL_ASCII\" is not supported for ICU locales");
+
+test_createdb(
+ "C locale too many modifiers",
+ "--locale=C\@icu\@libc",
+ "",
+ "invalid locale name: \"C\@icu\"");
+
+test_createdb(
+ "ICU language tag format locale",
+ "--locale=und-x-icu",
+ "",
+ "invalid locale name: \"und-x-icu\"");
+
+# --lc-collate with the same --lc-ctype if needed
+
+test_createdb(
+ "empty libc lc_collate",
+ "--lc-collate=\@libc",
+ "libc",
+ "");
+
+test_createdb(
+ "C lc_collate without collation provider",
+ "--lc-collate=C --lc-ctype=C",
+ "libc",
+ "");
+
+test_createdb(
+ "POSIX lc_collate without collation provider",
+ "--lc-collate=POSIX --lc-ctype=POSIX",
+ "libc",
+ "");
+
+test_createdb(
+ "C libc lc_collate",
+ "--lc-collate=C\@libc --lc-ctype=C",
+ "libc",
+ "");
+
+test_createdb(
+ "POSIX libc lc_collate",
+ "--lc-collate=POSIX\@libc --lc-ctype=POSIX",
+ "libc",
+ "");
+
+if ($c_to_icu_language_tag)
+{
+ test_createdb(
+ "C icu lc_collate with SQL_ASCII encoding and superuser",
+ "--lc-collate=C\@icu --lc-ctype=C --encoding=SQL_ASCII",
+ "icu",
+ "");
+}
+else
+{
+ test_createdb(
+ "C icu lc_collate with SQL_ASCII encoding and superuser",
+ "--lc-collate=C\@icu --lc-ctype=C --encoding=SQL_ASCII",
+ "",
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR");
+}
+
+test_createdb(
+ "C icu lc_collate with SQL_ASCII encoding and non-superuser",
+ "--lc-collate=C\@icu --lc-ctype=C --encoding=SQL_ASCII "
+ . "--username=non_superuser",
+ "",
+ "encoding \"SQL_ASCII\" is not supported for ICU locales");
+
+if ($c_to_icu_language_tag)
+{
+ test_createdb(
+ "POSIX icu lc_collate with SQL_ASCII encoding and superuser",
+ "--lc-collate=POSIX\@icu --lc-ctype=POSIX --encoding=SQL_ASCII",
+ "icu",
+ "");
+
+}
+else
+{
+ test_createdb(
+ "POSIX icu lc_collate with SQL_ASCII encoding and superuser",
+ "--lc-collate=POSIX\@icu --lc-ctype=POSIX --encoding=SQL_ASCII",
+ "",
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR");
+}
+
+test_createdb(
+ "POSIX icu lc_collate with SQL_ASCII encoding and non-superuser",
+ "--lc-collate=POSIX\@icu --lc-ctype=POSIX --encoding=SQL_ASCII "
+ . "--username=non_superuser",
+ "",
+ "encoding \"SQL_ASCII\" is not supported for ICU locales");
+
+test_createdb(
+ "C lc_collate too many modifiers",
+ "--lc-collate=C\@icu\@libc",
+ "",
+ "invalid locale name: \"C\@icu\"");
+
+test_createdb(
+ "ICU language tag format lc_collate",
+ "--lc-collate=und-x-icu",
+ "",
+ "invalid locale name: \"und-x-icu\"");
+
+# test CREATE DATABASE
+
+# LC_COLLATE with the same LC_CTYPE if needed
+
+test_create_database(
+ "empty libc lc_collate",
+ "LC_COLLATE = '\@libc'",
+ "",
+ "libc",
+ "");
+
+test_create_database(
+ "C lc_collate without collation provider",
+ "LC_COLLATE = 'C' LC_CTYPE = 'C'",
+ "",
+ "libc",
+ "");
+
+test_create_database(
+ "POSIX lc_collate without collation provider",
+ "LC_COLLATE = 'POSIX' LC_CTYPE = 'POSIX'",
+ "",
+ "libc",
+ "");
+
+test_create_database(
+ "C libc lc_collate",
+ "LC_COLLATE = 'C\@libc' LC_CTYPE = 'C'",
+ "",
+ "libc",
+ "");
+
+test_create_database(
+ "POSIX libc lc_collate",
+ "LC_COLLATE = 'POSIX\@libc' LC_CTYPE = 'POSIX'",
+ "",
+ "libc",
+ "");
+
+if ($c_to_icu_language_tag)
+{
+ test_create_database(
+ "C icu lc_collate with SQL_ASCII encoding and superuser",
+ "LC_COLLATE = 'C\@icu' LC_CTYPE = 'C' ENCODING = 'SQL_ASCII'",
+ "",
+ "icu",
+ "");
+}
+else
+{
+ test_create_database(
+ "C icu lc_collate with SQL_ASCII encoding and superuser",
+ "LC_COLLATE = 'C\@icu' LC_CTYPE = 'C' ENCODING = 'SQL_ASCII'",
+ "",
+ "",
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR");
+}
+
+test_create_database(
+ "C icu lc_collate with SQL_ASCII encoding and non-superuser",
+ "LC_COLLATE = 'C\@icu' LC_CTYPE = 'C' ENCODING = 'SQL_ASCII'",
+ "--username=non_superuser",
+ "",
+ "encoding \"SQL_ASCII\" is not supported for ICU locales");
+
+if ($c_to_icu_language_tag)
+{
+ test_create_database(
+ "POSIX icu lc_collate with SQL_ASCII encoding and superuser",
+ "LC_COLLATE = 'POSIX\@icu' LC_CTYPE = 'POSIX' ENCODING = 'SQL_ASCII'",
+ "",
+ "icu",
+ "");
+}
+else
+{
+ test_create_database(
+ "POSIX icu lc_collate with SQL_ASCII encoding and superuser",
+ "LC_COLLATE = 'POSIX\@icu' LC_CTYPE = 'POSIX' ENCODING = 'SQL_ASCII'",
+ "",
+ "",
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR");
+}
+
+test_create_database(
+ "POSIX icu lc_collate with SQL_ASCII encoding and non-superuser",
+ "LC_COLLATE = 'POSIX\@icu' LC_CTYPE = 'POSIX' ENCODING = 'SQL_ASCII'",
+ "--username=non_superuser",
+ "",
+ "encoding \"SQL_ASCII\" is not supported for ICU locales");
+
+test_create_database(
+ "C lc_collate too many modifiers",
+ "LC_COLLATE = 'C\@icu\@libc'",
+ "",
+ "",
+ "invalid locale name: \"C\@icu\"");
+
+test_create_database(
+ "ICU language tag format lc_collate",
+ "LC_COLLATE = 'und-x-icu'",
+ "",
+ "",
+ "invalid locale name: \"und-x-icu\"");
+
+$node->stop;
diff --git a/src/test/default_collation/libc.utf8/.gitignore b/src/test/default_collation/libc.utf8/.gitignore
new file mode 100644
index 0000000000..871e943d50
--- /dev/null
+++ b/src/test/default_collation/libc.utf8/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/default_collation/libc.utf8/Makefile b/src/test/default_collation/libc.utf8/Makefile
new file mode 100644
index 0000000000..e5b9d20958
--- /dev/null
+++ b/src/test/default_collation/libc.utf8/Makefile
@@ -0,0 +1,11 @@
+# src/test/default_collation/libc.utf8/Makefile
+
+subdir = src/test/default_collation/libc.utf8
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+ $(prove_check)
+
+clean distclean maintainer-clean:
+ rm -rf tmp_check
diff --git a/src/test/default_collation/libc.utf8/t/001_default_collation.pl b/src/test/default_collation/libc.utf8/t/001_default_collation.pl
new file mode 100644
index 0000000000..e4b3552922
--- /dev/null
+++ b/src/test/default_collation/libc.utf8/t/001_default_collation.pl
@@ -0,0 +1,703 @@
+use strict;
+use warnings;
+
+use Config;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 168;
+
+my $tempdir = TestLib::tempdir;
+my $datadir = "$tempdir/data";
+
+# test initdb
+
+sub test_initdb
+{
+ my ($test_name, $options, $error_message) = @_;
+ my ($in_initdb, $out_initdb, $err_initdb);
+
+ mkdir $datadir;
+
+ my @command = (qw(initdb -A trust -N -D), $datadir, split(" ", $options));
+ print "# Running: " . join(" ", @command) . "\n";
+ my $result = IPC::Run::run \@command, \$in_initdb, \$out_initdb,
+ \$err_initdb;
+
+ if ($error_message)
+ {
+ like($err_initdb,
+ qr{$error_message},
+ "initdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_initdb, "", "\"@command\" no stderr");
+ like($out_initdb,
+ qr{The default collation provider is \"libc\"\.},
+ "initdb: $test_name: check output");
+ }
+
+ File::Path::rmtree $datadir;
+}
+
+sub psql
+{
+ my ($command, $db) = @_;
+ my ($result, $in, $out, $err);
+ my @psql = ('psql', '-X', '-c', $command);
+ if (defined($db))
+ {
+ push(@psql, $db);
+ }
+ print "# Running: " . join(" ", @psql) . "\n";
+ $result = IPC::Run::run \@psql, \$in, \$out, \$err;
+ ($result, $out, $err);
+}
+
+# --locale
+
+test_initdb(
+ "be_BY\@latin libc locale",
+ "--locale=be_BY\@latin\@libc",
+ "");
+
+test_initdb(
+ "be_BY\@latin libc locale invalid modifier order",
+ "--locale=be_BY\@libc\@latin",
+ "invalid locale name \"be_BY\@libc\@latin\"");
+
+# --lc-collate
+
+test_initdb(
+ "be_BY\@latin libc lc_collate",
+ "--lc-collate=be_BY\@latin\@libc",
+ "");
+
+test_initdb(
+ "be_BY\@latin libc lc_collate invalid modifier order",
+ "--lc-collate=be_BY\@libc\@latin",
+ "invalid locale name \"be_BY\@libc\@latin\" \\(provider \"libc\"\\)");
+
+# test createdb, CREATE DATABASE and default collation behaviour
+
+sub test_createdb
+{
+ my ($test_name, $options, $from_template0, $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ if ($from_template0)
+ {
+ $options = $options . " --template=template0";
+ }
+
+ @command = ("createdb", split(" ", $options), "mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "createdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+
+ ($result, $out_command, $err_command) = psql(
+ "select datcollate from pg_database where datname = 'mydb';");
+
+ like($out_command,
+ qr{\@libc\n},
+ "createdb: $test_name: check pg_database.datcollate");
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+sub test_create_database
+{
+ my ($test_name, $options, $from_template0, $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ ($result, $out_command, $err_command) = psql(
+ "create database mydb "
+ . $options
+ . ($from_template0 ? " TEMPLATE = template0;" : ";"));
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "CREATE DATABASE: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "CREATE DATABASE: $test_name: exit code 0");
+ is($err_command, "", "CREATE DATABASE: $test_name: no stderr");
+ like($out_command, qr{CREATE DATABASE}, "CREATE DATABASE: $test_name: check output");
+
+ ($result, $out_command, $err_command) = psql(
+ "select datcollate from pg_database where datname = 'mydb';");
+
+ like($out_command,
+ qr{\@libc\n},
+ "CREATE DATABASE: $test_name: check pg_database.datcollate");
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+sub test_default_collation
+{
+ my ($createdb_options, $collation, @commands) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ @command = ("createdb", split(" ", $createdb_options), "mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+
+ ($result, $out_command, $err_command) = psql(
+ "select datcollate from pg_database where datname = 'mydb';");
+
+ like($out_command, qr{\@libc\n}, "\"@command\" check output");
+
+ for (my $row = 0; $row <= $#commands; $row++)
+ {
+ my ($command_text, $expected) = @{$commands[$row]};
+ ($result, $out_command, $err_command) = psql($command_text, "mydb");
+
+ ok($result, "\"$command_text\" exit code 0");
+ is($err_command, "", "\"$command_text\" no stderr");
+ if ($out_command)
+ {
+ is(
+ $out_command,
+ $expected,
+ "default collation "
+ . $collation
+ . ": \""
+ . $command_text
+ . "\" check output");
+ }
+ }
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+}
+
+my $node = get_new_node('main');
+$node->init;
+$node->start;
+local $ENV{PGPORT} = $node->port;
+
+# test createdb
+
+# --locale
+
+test_createdb(
+ "be_BY\@latin libc locale",
+ "--locale=be_BY\@latin\@libc",
+ 1,
+ "");
+
+test_createdb(
+ "be_BY\@latin libc locale invalid modifier order",
+ "--locale=be_BY\@libc\@latin",
+ 1,
+ "invalid locale name: \"be_BY\@libc\@latin\"");
+
+# --lc-collate
+
+test_createdb(
+ "be_BY\@latin libc lc_collate",
+ "--lc-collate=be_BY\@latin\@libc",
+ 1,
+ "");
+
+test_createdb(
+ "be_BY\@latin libc lc_collate invalid modifier order",
+ "--lc-collate=be_BY\@libc\@latin",
+ 1,
+ "invalid locale name: \"be_BY\@libc\@latin\" \\(provider \"libc\"\\)");
+
+# test CREATE DATABASE
+
+# LC_COLLATE
+
+test_create_database(
+ "be_BY\@latin libc lc_collate",
+ "LC_COLLATE = 'be_BY\@latin\@libc'",
+ 1,
+ "");
+
+test_create_database(
+ "be_BY\@latin libc lc_collate invalid modifier order",
+ "LC_COLLATE = 'be_BY\@libc\@latin'",
+ 1,
+ "invalid locale name: \"be_BY\@libc\@latin\" \\(provider \"libc\"\\)");
+
+# test default collation behaviour
+# use commands and outputs from the regression test collate.linux.utf8
+
+test_default_collation(
+ "--lc-collate=en_US.utf8\@libc --template=template0",
+ "en_US.utf8\@libc",
+ (
+ [
+ "CREATE TABLE collate_test1 (a int, b text NOT NULL);",
+ "CREATE TABLE\n"
+ ],
+ [
+ "INSERT INTO collate_test1 VALUES "
+ . "(1, 'abc'), (2, 'äbc'), (3, 'bbc'), (4, 'ABC');",
+ "INSERT 0 4\n"],
+ [
+ "SELECT * FROM collate_test1 WHERE b >= 'bbc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 3 | bbc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, b FROM collate_test1 ORDER BY b;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # star expansion
+ [
+ "SELECT * FROM collate_test1 ORDER BY b;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # upper/lower
+ ["CREATE TABLE collate_test10 (a int, x text);", "CREATE TABLE\n", ""],
+ [
+ "INSERT INTO collate_test10 VALUES (1, 'hij'), (2, 'HIJ');",
+ "INSERT 0 2\n"
+ ],
+ [
+ "SELECT a, lower(x), upper(x), initcap(x) FROM collate_test10;",
+ " a | lower | upper | initcap \n"
+ . "---+-------+-------+---------\n"
+ . " 1 | hij | HIJ | Hij\n"
+ . " 2 | hij | HIJ | Hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ # LIKE/ILIKE
+ [
+ "SELECT * FROM collate_test1 WHERE b LIKE 'abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b LIKE 'abc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b LIKE '%bc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(3 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ILIKE 'abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ILIKE 'abc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ILIKE '%bc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 4 | ABC\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'Türkiye' ILIKE '%KI%' AS \"true\";",
+ " true \n"
+ . "------\n"
+ . " t\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'bıt' ILIKE 'BIT' AS \"false\";",
+ " false \n"
+ . "-------\n"
+ . " f\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ # regular expressions
+ [
+ "SELECT * FROM collate_test1 WHERE b ~ '^abc\$';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~ '^abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~ 'bc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(3 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~* '^abc\$';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~* '^abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~* 'bc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 4 | ABC\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ ["CREATE TABLE collate_test6 (a int, b text);", "CREATE TABLE\n", ""],
+ [
+ "INSERT INTO collate_test6 VALUES "
+ . "(1, 'abc'), (2, 'ABC'), (3, '123'), (4, 'ab1'), "
+ . "(5, 'a1!'), (6, 'a c'), (7, '!.;'), (8, ' '), "
+ . "(9, 'äbç'), (10, 'ÄBÇ');",
+ "INSERT 0 10\n"
+ ],
+ [
+ "SELECT b, "
+ . "b ~ '^[[:alpha:]]+\$' AS is_alpha, "
+ . "b ~ '^[[:upper:]]+\$' AS is_upper, "
+ . "b ~ '^[[:lower:]]+\$' AS is_lower, "
+ . "b ~ '^[[:digit:]]+\$' AS is_digit, "
+ . "b ~ '^[[:alnum:]]+\$' AS is_alnum, "
+ . "b ~ '^[[:graph:]]+\$' AS is_graph, "
+ . "b ~ '^[[:print:]]+\$' AS is_print, "
+ . "b ~ '^[[:punct:]]+\$' AS is_punct, "
+ . "b ~ '^[[:space:]]+\$' AS is_space "
+ . "FROM collate_test6;",
+ " b | is_alpha | is_upper | is_lower | is_digit | is_alnum | is_graph | is_print | is_punct | is_space \n"
+ . "-----+----------+----------+----------+----------+----------+----------+----------+----------+----------\n"
+ . " abc | t | f | t | f | t | t | t | f | f\n"
+ . " ABC | t | t | f | f | t | t | t | f | f\n"
+ . " 123 | f | f | f | t | t | t | t | f | f\n"
+ . " ab1 | f | f | f | f | t | t | t | f | f\n"
+ . " a1! | f | f | f | f | f | t | t | f | f\n"
+ . " a c | f | f | f | f | f | f | t | f | f\n"
+ . " !.; | f | f | f | f | f | t | t | t | f\n"
+ . " | f | f | f | f | f | f | t | f | t\n"
+ . " äbç | t | f | t | f | t | t | t | f | f\n"
+ . " ÄBÇ | t | t | f | f | t | t | t | f | f\n"
+ . "(10 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'Türkiye' ~* 'KI' AS \"true\";",
+ " true \n"
+ . "------\n"
+ . " t\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'bıt' ~* 'BIT' AS \"false\";",
+ " false \n"
+ . "-------\n"
+ . " f\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, lower(coalesce(x, 'foo')) FROM collate_test10;",
+ " a | lower \n"
+ . "---+-------\n"
+ . " 1 | hij\n"
+ . " 2 | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, b, greatest(b, 'CCC') FROM collate_test1 ORDER BY 3;",
+ " a | b | greatest \n"
+ . "---+-----+----------\n"
+ . " 1 | abc | CCC\n"
+ . " 2 | äbc | CCC\n"
+ . " 3 | bbc | CCC\n"
+ . " 4 | ABC | CCC\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, x, lower(greatest(x, 'foo')) FROM collate_test10;",
+ " a | x | lower \n"
+ . "---+-----+-------\n"
+ . " 1 | hij | hij\n"
+ . " 2 | HIJ | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, nullif(b, 'abc') FROM collate_test1 ORDER BY 2;",
+ " a | nullif \n"
+ . "---+--------\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 1 | \n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, lower(nullif(x, 'foo')) FROM collate_test10;",
+ " a | lower \n"
+ . "---+-------\n"
+ . " 1 | hij\n"
+ . " 2 | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, CASE b WHEN 'abc' THEN 'abcd' ELSE b END "
+ . "FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+------\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 1 | abcd\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ ["CREATE DOMAIN testdomain AS text;", "CREATE DOMAIN\n", ""],
+ [
+ "SELECT a, b::testdomain FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, lower(x::testdomain) FROM collate_test10;",
+ " a | lower \n"
+ . "---+-------\n"
+ . " 1 | hij\n"
+ . " 2 | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT min(b), max(b) FROM collate_test1;",
+ " min | max \n"
+ . "-----+-----\n"
+ . " abc | bbc\n"
+ . "(1 row)\n"
+ . "\n",
+ ""
+ ],
+ [
+ "SELECT array_agg(b ORDER BY b) FROM collate_test1;",
+ " array_agg \n"
+ . "-------------------\n"
+ . " {abc,ABC,äbc,bbc}\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, b FROM collate_test1 "
+ . "UNION ALL "
+ . "SELECT a, b FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 3 | bbc\n"
+ . "(8 rows)\n"
+ . "\n"
+ ],
+ # casting
+ [
+ "SELECT a, CAST(b AS varchar) FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # propagation of collation in SQL functions (inlined and non-inlined
+ # cases) and plpgsql functions too
+ [
+ "CREATE FUNCTION mylt (text, text) RETURNS boolean LANGUAGE sql "
+ . "AS \$\$ select \$1 < \$2 \$\$;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "CREATE FUNCTION mylt_noninline (text, text) "
+ . "RETURNS boolean LANGUAGE sql "
+ . "AS \$\$ select \$1 < \$2 limit 1 \$\$;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "CREATE FUNCTION mylt_plpgsql (text, text) "
+ . "RETURNS boolean LANGUAGE plpgsql "
+ . "AS \$\$ begin return \$1 < \$2; end \$\$;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "SELECT a.b AS a, b.b AS b, a.b < b.b AS lt, "
+ . "mylt(a.b, b.b), mylt_noninline(a.b, b.b), mylt_plpgsql(a.b, b.b) "
+ . "FROM collate_test1 a, collate_test1 b "
+ . "ORDER BY a.b, b.b;",
+ " a | b | lt | mylt | mylt_noninline | mylt_plpgsql \n"
+ . "-----+-----+----+------+----------------+--------------\n"
+ . " abc | abc | f | f | f | f\n"
+ . " abc | ABC | t | t | t | t\n"
+ . " abc | äbc | t | t | t | t\n"
+ . " abc | bbc | t | t | t | t\n"
+ . " ABC | abc | f | f | f | f\n"
+ . " ABC | ABC | f | f | f | f\n"
+ . " ABC | äbc | t | t | t | t\n"
+ . " ABC | bbc | t | t | t | t\n"
+ . " äbc | abc | f | f | f | f\n"
+ . " äbc | ABC | f | f | f | f\n"
+ . " äbc | äbc | f | f | f | f\n"
+ . " äbc | bbc | t | t | t | t\n"
+ . " bbc | abc | f | f | f | f\n"
+ . " bbc | ABC | f | f | f | f\n"
+ . " bbc | äbc | f | f | f | f\n"
+ . " bbc | bbc | f | f | f | f\n"
+ . "(16 rows)\n"
+ . "\n"
+ ],
+ # polymorphism
+ [
+ "SELECT * FROM unnest("
+ . "(SELECT array_agg(b ORDER BY b) FROM collate_test1)"
+ . ") ORDER BY 1;",
+ " unnest \n"
+ . "--------\n"
+ . " abc\n"
+ . " ABC\n"
+ . " äbc\n"
+ . " bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "CREATE FUNCTION dup (anyelement) RETURNS anyelement "
+ . "AS 'select \$1' LANGUAGE sql;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "SELECT a, dup(b) FROM collate_test1 ORDER BY 2;",
+ " a | dup \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # indexes
+ [
+ "CREATE INDEX collate_test1_idx1 ON collate_test1 (b);",
+ "CREATE INDEX\n"
+ ]
+ )
+);
+
+$node->stop;
diff --git a/src/test/default_collation/libc/.gitignore b/src/test/default_collation/libc/.gitignore
new file mode 100644
index 0000000000..871e943d50
--- /dev/null
+++ b/src/test/default_collation/libc/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/default_collation/libc/Makefile b/src/test/default_collation/libc/Makefile
new file mode 100644
index 0000000000..98ab736d7a
--- /dev/null
+++ b/src/test/default_collation/libc/Makefile
@@ -0,0 +1,11 @@
+# src/test/default_collation/libc/Makefile
+
+subdir = src/test/default_collation/libc
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+ $(prove_check)
+
+clean distclean maintainer-clean:
+ rm -rf tmp_check
diff --git a/src/test/default_collation/libc/t/001_default_collation.pl b/src/test/default_collation/libc/t/001_default_collation.pl
new file mode 100644
index 0000000000..bc8a6ad02c
--- /dev/null
+++ b/src/test/default_collation/libc/t/001_default_collation.pl
@@ -0,0 +1,355 @@
+use strict;
+use warnings;
+
+use Config;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 90;
+
+my $tempdir = TestLib::tempdir;
+my $datadir = "$tempdir/data";
+
+# test initdb
+
+sub test_initdb
+{
+ my ($test_name, $options, $error_message) = @_;
+ my ($in_initdb, $out_initdb, $err_initdb);
+
+ mkdir $datadir;
+
+ my @command = (qw(initdb -A trust -N -D), $datadir, split(" ", $options));
+ print "# Running: " . join(" ", @command) . "\n";
+ my $result = IPC::Run::run \@command, \$in_initdb, \$out_initdb,
+ \$err_initdb;
+
+ if ($error_message)
+ {
+ like($err_initdb,
+ qr{$error_message},
+ "initdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_initdb, "", "\"@command\" no stderr");
+ like($out_initdb,
+ qr{The default collation provider is \"libc\"\.},
+ "initdb: $test_name: check output");
+ }
+
+ File::Path::rmtree $datadir;
+}
+
+# empty locales
+
+test_initdb(
+ "empty locales",
+ "",
+ "");
+
+# --locale
+
+test_initdb(
+ "empty libc locale",
+ "--locale=\@libc",
+ "");
+
+test_initdb(
+ "C locale without collation provider",
+ "--locale=C",
+ "");
+
+test_initdb(
+ "POSIX locale without collation provider",
+ "--locale=POSIX",
+ "");
+
+test_initdb(
+ "C libc locale",
+ "--locale=C\@libc",
+ "");
+
+test_initdb(
+ "C icu locale",
+ "--locale=C\@icu",
+ "ICU is not supported in this build");
+
+test_initdb(
+ "C locale too many modifiers",
+ "--locale=C\@icu\@libc",
+ "invalid locale name \"C\@icu\"");
+
+# --lc-collate
+
+test_initdb(
+ "empty libc lc_collate",
+ "--lc-collate=\@libc",
+ "");
+
+test_initdb(
+ "C lc_collate without collation provider",
+ "--lc-collate=C",
+ "");
+
+test_initdb(
+ "POSIX lc_collate without collation provider",
+ "--lc-collate=POSIX",
+ "");
+
+test_initdb(
+ "C libc lc_collate",
+ "--lc-collate=C\@libc",
+ "");
+
+test_initdb(
+ "C icu lc_collate",
+ "--lc-collate=C\@icu",
+ "ICU is not supported in this build");
+
+test_initdb(
+ "C lc_collate too many modifiers",
+ "--lc-collate=C\@icu\@libc",
+ "invalid locale name \"C\@icu\" \\(provider \"libc\"\\)");
+
+# --locale & --lc-collate
+
+test_initdb(
+ "lc_collate implicit provider takes precedence",
+ "--locale=\@icu --lc-collate=C",
+ "");
+
+test_initdb(
+ "lc_collate explicit provider takes precedence",
+ "--locale=\@icu --lc-collate=\@libc",
+ "");
+
+# test createdb and CREATE DATABASE
+
+sub test_createdb
+{
+ my ($test_name, $options, $from_template0, $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ if ($from_template0)
+ {
+ $options = $options . " --template=template0";
+ }
+
+ @command = ("createdb", split(" ", $options), "mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "createdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+
+ @command = (
+ "psql",
+ "-c",
+ "select datcollate from pg_database where datname = 'mydb';");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ like($out_command,
+ qr{\@libc\n},
+ "createdb: $test_name: check pg_database.datcollate");
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+sub test_create_database
+{
+ my ($test_name, $options, $from_template0, $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ @command = ("psql",
+ "-c",
+ "create database mydb "
+ . $options
+ . ($from_template0 ? " template = template0" : "")
+ . ";");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "CREATE DATABASE: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+ like($out_command, qr{CREATE DATABASE}, "\"@command\" check output");
+
+ @command = (
+ "psql",
+ "-c",
+ "select datcollate from pg_database where datname = 'mydb';");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ like($out_command,
+ qr{\@libc\n},
+ "CREATE DATABASE: $test_name: check pg_database.datcollate");
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+my $node = get_new_node('main');
+$node->init;
+$node->start;
+local $ENV{PGPORT} = $node->port;
+
+# test createdb
+
+# empty locales
+
+test_createdb(
+ "empty locales",
+ "",
+ 0,
+ "");
+
+# --locale
+
+test_createdb(
+ "empty libc locale",
+ "--locale=\@libc",
+ 0,
+ "");
+
+test_createdb(
+ "C locale without collation provider",
+ "--locale=C",
+ 1,
+ "");
+
+test_createdb(
+ "POSIX locale without collation provider",
+ "--locale=POSIX",
+ 1,
+ "");
+
+test_createdb(
+ "C libc locale",
+ "--locale=C\@libc",
+ 1,
+ "");
+
+test_createdb(
+ "C icu locale",
+ "--locale=C\@icu",
+ 1,
+ "ICU is not supported in this build");
+
+test_createdb(
+ "C locale too many modifiers",
+ "--locale=C\@icu\@libc",
+ 1,
+ "invalid locale name: \"C\@icu\"");
+
+# --lc-collate
+
+test_createdb(
+ "empty libc lc_collate",
+ "--lc-collate=\@libc",
+ 0,
+ "");
+
+test_createdb(
+ "C lc_collate without collation provider",
+ "--lc-collate=C",
+ 1,
+ "");
+test_createdb(
+ "POSIX lc_collate without collation provider",
+ "--lc-collate=POSIX",
+ 1,
+ "");
+
+test_createdb(
+ "C libc lc_collate",
+ "--lc-collate=C\@libc",
+ 1,
+ "");
+
+test_createdb(
+ "C icu lc_collate",
+ "--lc-collate=C\@icu",
+ 1,
+ "ICU is not supported in this build");
+
+test_createdb(
+ "C lc_collate too many modifiers",
+ "--lc-collate=C\@icu\@libc",
+ 1,
+ "invalid locale name: \"C\@icu\" \\(provider \"libc\"\\)");
+
+# test CREATE DATABASE
+
+# empty locales
+
+test_create_database(
+ "empty locales",
+ "",
+ 0,
+ "");
+
+# LC_COLLATE
+
+test_create_database(
+ "empty libc lc_collate",
+ "LC_COLLATE = '\@libc'",
+ 0,
+ "");
+
+test_create_database(
+ "C lc_collate without collation provider",
+ "LC_COLLATE = 'C'",
+ 1,
+ "");
+test_create_database(
+ "POSIX lc_collate without collation provider",
+ "LC_COLLATE = 'POSIX'",
+ 1,
+ "");
+
+test_create_database(
+ "C libc lc_collate",
+ "LC_COLLATE = 'C\@libc'",
+ 1,
+ "");
+
+test_create_database(
+ "C icu lc_collate",
+ "LC_COLLATE = 'C\@icu'",
+ 1,
+ "ICU is not supported in this build");
+
+test_create_database(
+ "C lc_collate too many modifiers",
+ "LC_COLLATE = 'C\@icu\@libc'",
+ 1,
+ "invalid locale name: \"C\@icu\" \\(provider \"libc\"\\)");
+
+$node->stop;
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index f485b5c330..3fae21e09a 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -979,11 +979,14 @@ SELECT relname, pg_get_indexdef(oid) FROM pg_class WHERE relname LIKE 'collate_t
-- schema manipulation commands
CREATE ROLE regress_test_role;
CREATE SCHEMA test_schema;
+-- remove provider modifier and collation version
+CREATE FUNCTION get_lc_collate (text) RETURNS text LANGUAGE sql
+ AS $$ select substring($1 from '(.*)@[^@]+$') $$;
-- We need to do this this way to cope with varying names for encodings:
do $$
BEGIN
EXECUTE 'CREATE COLLATION test0 (provider = icu, locale = ' ||
- quote_literal(current_setting('lc_collate')) || ');';
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) || ');';
END
$$;
CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
@@ -991,7 +994,7 @@ ERROR: collation "test0" already exists
do $$
BEGIN
EXECUTE 'CREATE COLLATION test1 (provider = icu, lc_collate = ' ||
- quote_literal(current_setting('lc_collate')) ||
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) ||
', lc_ctype = ' ||
quote_literal(current_setting('lc_ctype')) || ');';
END
@@ -1102,7 +1105,7 @@ drop type textrange_c;
drop type textrange_en_us;
-- cleanup
DROP SCHEMA collate_tests CASCADE;
-NOTICE: drop cascades to 18 other objects
+NOTICE: drop cascades to 19 other objects
DETAIL: drop cascades to table collate_test1
drop cascades to table collate_test_like
drop cascades to table collate_test2
@@ -1121,6 +1124,7 @@ drop cascades to function mylt_noninline(text,text)
drop cascades to function mylt_plpgsql(text,text)
drop cascades to function mylt2(text,text)
drop cascades to function dup(anyelement)
+drop cascades to function get_lc_collate(text)
RESET search_path;
-- leave a collation for pg_upgrade test
CREATE COLLATION coll_icu_upgrade FROM "und-x-icu";
diff --git a/src/test/regress/expected/collate.linux.utf8.out b/src/test/regress/expected/collate.linux.utf8.out
index 400a747cdc..7aa8057323 100644
--- a/src/test/regress/expected/collate.linux.utf8.out
+++ b/src/test/regress/expected/collate.linux.utf8.out
@@ -988,11 +988,14 @@ SELECT relname, pg_get_indexdef(oid) FROM pg_class WHERE relname LIKE 'collate_t
-- schema manipulation commands
CREATE ROLE regress_test_role;
CREATE SCHEMA test_schema;
+-- remove provider modifier and collation version
+CREATE FUNCTION get_lc_collate (text) RETURNS text LANGUAGE sql
+ AS $$ select substring($1 from '(.*)@[^@]+$') $$;
-- We need to do this this way to cope with varying names for encodings:
do $$
BEGIN
EXECUTE 'CREATE COLLATION test0 (locale = ' ||
- quote_literal(current_setting('lc_collate')) || ');';
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) || ');';
END
$$;
CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
@@ -1004,7 +1007,7 @@ NOTICE: collation "test0" for encoding "UTF8" already exists, skipping
do $$
BEGIN
EXECUTE 'CREATE COLLATION test1 (lc_collate = ' ||
- quote_literal(current_setting('lc_collate')) ||
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) ||
', lc_ctype = ' ||
quote_literal(current_setting('lc_ctype')) || ');';
END
@@ -1119,7 +1122,7 @@ drop type textrange_c;
drop type textrange_en_us;
-- cleanup
DROP SCHEMA collate_tests CASCADE;
-NOTICE: drop cascades to 18 other objects
+NOTICE: drop cascades to 19 other objects
DETAIL: drop cascades to table collate_test1
drop cascades to table collate_test_like
drop cascades to table collate_test2
@@ -1138,3 +1141,4 @@ drop cascades to function mylt_noninline(text,text)
drop cascades to function mylt_plpgsql(text,text)
drop cascades to function mylt2(text,text)
drop cascades to function dup(anyelement)
+drop cascades to function get_lc_collate(text)
diff --git a/src/test/regress/sql/collate.icu.utf8.sql b/src/test/regress/sql/collate.icu.utf8.sql
index ef39445b30..936d684e10 100644
--- a/src/test/regress/sql/collate.icu.utf8.sql
+++ b/src/test/regress/sql/collate.icu.utf8.sql
@@ -339,18 +339,22 @@ SELECT relname, pg_get_indexdef(oid) FROM pg_class WHERE relname LIKE 'collate_t
CREATE ROLE regress_test_role;
CREATE SCHEMA test_schema;
+-- remove provider modifier and collation version
+CREATE FUNCTION get_lc_collate (text) RETURNS text LANGUAGE sql
+ AS $$ select substring($1 from '(.*)@[^@]+$') $$;
+
-- We need to do this this way to cope with varying names for encodings:
do $$
BEGIN
EXECUTE 'CREATE COLLATION test0 (provider = icu, locale = ' ||
- quote_literal(current_setting('lc_collate')) || ');';
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) || ');';
END
$$;
CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
do $$
BEGIN
EXECUTE 'CREATE COLLATION test1 (provider = icu, lc_collate = ' ||
- quote_literal(current_setting('lc_collate')) ||
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) ||
', lc_ctype = ' ||
quote_literal(current_setting('lc_ctype')) || ');';
END
diff --git a/src/test/regress/sql/collate.linux.utf8.sql b/src/test/regress/sql/collate.linux.utf8.sql
index b51162e3a1..e03ea1bde1 100644
--- a/src/test/regress/sql/collate.linux.utf8.sql
+++ b/src/test/regress/sql/collate.linux.utf8.sql
@@ -339,11 +339,15 @@ SELECT relname, pg_get_indexdef(oid) FROM pg_class WHERE relname LIKE 'collate_t
CREATE ROLE regress_test_role;
CREATE SCHEMA test_schema;
+-- remove provider modifier and collation version
+CREATE FUNCTION get_lc_collate (text) RETURNS text LANGUAGE sql
+ AS $$ select substring($1 from '(.*)@[^@]+$') $$;
+
-- We need to do this this way to cope with varying names for encodings:
do $$
BEGIN
EXECUTE 'CREATE COLLATION test0 (locale = ' ||
- quote_literal(current_setting('lc_collate')) || ');';
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) || ');';
END
$$;
CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
@@ -352,7 +356,7 @@ CREATE COLLATION IF NOT EXISTS test0 (locale = 'foo'); -- ok, skipped
do $$
BEGIN
EXECUTE 'CREATE COLLATION test1 (lc_collate = ' ||
- quote_literal(current_setting('lc_collate')) ||
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) ||
', lc_ctype = ' ||
quote_literal(current_setting('lc_ctype')) || ');';
END
diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index 59bed3b8a8..a5bf7e1703 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -51,12 +51,19 @@ my @contrib_excludes = (
'snapshot_too_old');
# Set of variables for frontend modules
-my $frontend_defines = { 'initdb' => 'FRONTEND' };
+my $frontend_defines = {
+ 'initdb' => 'FRONTEND',
+ 'psql' => 'FRONTEND',
+ 'pg_dump' => 'FRONTEND',
+ 'pg_dumpall' => 'FRONTEND',
+ 'pg_restore' => 'FRONTEND',
+ };
my @frontend_uselibpq = ('pg_ctl', 'pg_upgrade', 'pgbench', 'psql', 'initdb');
my @frontend_uselibpgport = (
'pg_archivecleanup', 'pg_test_fsync',
'pg_test_timing', 'pg_upgrade',
'pg_waldump', 'pgbench');
+my @iculibs = ('icuin.lib', 'icuuc.lib');
my @frontend_uselibpgcommon = (
'pg_archivecleanup', 'pg_test_fsync',
'pg_test_timing', 'pg_upgrade',
@@ -65,8 +72,10 @@ my $frontend_extralibs = {
'initdb' => ['ws2_32.lib'],
'pg_restore' => ['ws2_32.lib'],
'pgbench' => ['ws2_32.lib'],
+ 'mchar' => [@iculibs],
'psql' => ['ws2_32.lib']
};
+my @frontend_iculibs = ('initdb', 'pg_upgrade');
my $frontend_extraincludes = {
'initdb' => ['src/timezone'],
'psql' => ['src/backend']
@@ -117,7 +126,7 @@ sub mkvcbuild
our @pgcommonallfiles = qw(
base64.c config_info.c controldata_utils.c exec.c file_perm.c ip.c
- keywords.c link-canary.c md5.c
+ keywords.c link-canary.c md5.c pg_collation_fn_common.c
pg_lzcompress.c pgfnames.c psprintf.c relpath.c rmtree.c
saslprep.c scram-common.c string.c unicode_norm.c username.c
wait_error.c);
@@ -152,6 +161,7 @@ sub mkvcbuild
$libpgfeutils->AddDefine('FRONTEND');
$libpgfeutils->AddIncludeDir('src/interfaces/libpq');
$libpgfeutils->AddFiles('src/fe_utils', @pgfeutilsfiles);
+ $libpgfeutils->AddFile('src/common/pg_collation_fn_common.c');
$postgres = $solution->AddProject('postgres', 'exe', '', 'src/backend');
$postgres->AddIncludeDir('src/backend');
@@ -233,6 +243,7 @@ sub mkvcbuild
'src/interfaces/libpq');
$libpq->AddDefine('FRONTEND');
$libpq->AddDefine('UNSAFE_STAT_OK');
+ $libpq->AddDefine('LIBPQ_MAKE');
$libpq->AddIncludeDir('src/port');
$libpq->AddLibrary('secur32.lib');
$libpq->AddLibrary('ws2_32.lib');
@@ -241,6 +252,7 @@ sub mkvcbuild
$libpq->ReplaceFile('src/interfaces/libpq/libpqrc.c',
'src/interfaces/libpq/libpq.rc');
$libpq->AddReference($libpgcommon, $libpgport);
+ $libpq->AddFile('src/common/pg_collation_fn_common.c');
# The OBJS scraper doesn't know about ifdefs, so remove appropriate files
# if building without OpenSSL.
@@ -419,6 +431,12 @@ sub mkvcbuild
{
push @contrib_excludes, 'uuid-ossp';
}
+ else
+ {
+ foreach my $fe (@frontend_iculibs) {
+ push @{$frontend_extralibs->{$fe}}, @iculibs;
+ }
+ }
# AddProject() does not recognize the constructs used to populate OBJS in
# the pgcrypto Makefile, so it will discover no files.
--
2.17.1 (Apple Git-112)
On Tue, Oct 30, 2018 at 9:07 AM Andrey Borodin <x4mmm@yandex-team.ru> wrote:
Hi!
On Oct 2, 2018, at 11:37, Michael Paquier <michael@paquier.xyz> wrote:
Please note that the latest patch set does not apply, so this has been
switched to commit fest 2018-11, waiting on author for a rebase.
PFA rebased version. I've added LDFLAGS_INTERNAL += $(ICU_LIBS) in libpq,
Thanks for providing the rebased version.
but I'm not entirely sure this is correct way to deal with complaints on ICU
functions from libpq linking.
Well, it was enough on my own Gentoo, where the patch actually compiles without
errors and passes "make check". But for cfbot it doesn't compile, and I'm not sure
why.
../../../src/interfaces/libpq/libpq.so: undefined reference to `ucol_open_52'
../../../src/interfaces/libpq/libpq.so: undefined reference to `uloc_toLanguageTag_52'
../../../src/interfaces/libpq/libpq.so: undefined reference to `u_errorName_52'
../../../src/interfaces/libpq/libpq.so: undefined reference to `ucol_getVersion_52'
../../../src/interfaces/libpq/libpq.so: undefined reference to `u_versionToString_52'
../../../src/interfaces/libpq/libpq.so: undefined reference to `uloc_setDefault_52'
../../../src/interfaces/libpq/libpq.so: undefined reference to `uloc_getDefault_52'
../../../src/interfaces/libpq/libpq.so: undefined reference to `ucol_close_52'
As a side note, I'm a bit confused: who is the original author of the proposed
patch? If it's Marina, why isn't she involved in the discussion or even
mentioned in the patch itself?
Dmitry Dolgov wrote:
As a side note, I'm a bit confused: who is the original author of
the proposed patch? If it's Marina, why isn't she involved in the
discussion or even mentioned in the patch itself?
The original patch [1] starts with these commit metadata:
From e1cb130f550952d9c9c2d9ad1c52e60699a2c968 Mon Sep 17 00:00:00 2001
From: Marina Polyakova <m.polyakova@postgrespro.ru>
Date: Fri, 9 Feb 2018 18:57:25 +0300
Subject: [PATCH] ICU as default collation provider
... commit message...
but as you note, Marina did not intervene in the discussion nor
submitted it herself, so this patch misses someone to play the
role of the author in the CF process.
There were reviews: Andrey Borodin raised issues with the patch in
[2], I spent some time trying it and asked questions about the design
in [3], but no one followed up on them within the next months.
About the status of the patch, to me it should be RWF. It's been
moved to the next CF several times with no progress besides rebases.
[1]: <37A534BE-CBF7-467C-B096-0AAD25091A9F@yandex-team.ru>
[2]: <92826DEB-DA8F-4AE4-9C43-03A55D18A766@yandex-team.ru>
[3]: <7e86a1aa-a942-4f6b-978e-e9013a4af3fb@manitou-mail.org>
Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
On Sat, Dec 1, 2018 at 9:08 PM Daniel Verite <daniel@manitou-mail.org> wrote:
Dmitry Dolgov wrote:
As a side note, I'm a bit confused: who is the original author of
the proposed patch? If it's Marina, why isn't she involved in the
discussion or even mentioned in the patch itself?
The original patch [1] starts with these commit metadata:
From e1cb130f550952d9c9c2d9ad1c52e60699a2c968 Mon Sep 17 00:00:00 2001
From: Marina Polyakova <m.polyakova@postgrespro.ru>
Date: Fri, 9 Feb 2018 18:57:25 +0300
Subject: [PATCH] ICU as default collation provider
... commit message...
but as you note, Marina did not intervene in the discussion nor
submitted it herself, so this patch misses someone to play the
role of the author in the CF process.
Yes, I've missed that, thank you. But there were no references like that in the
last rebased version.
There were reviews: Andrey Borodin raised issues with the patch in
[2], I spent some time trying it and asked questions about the design
in [3], but no one followed up on them within the next months.
About the status of the patch, to me it should be RWF. It's been
moved to the next CF several times with no progress besides rebases.
Let me disagree. Judging from the comments in this discussion, it could be a
significant and useful feature, and the author is trying to keep this patch
up to date. The lack of reviews could be due to reasons other than the desirability of
the patch (as is the case for many other interesting proposals on hackers).
Hi, Dmitry, Daniel!
On Dec 2, 2018, at 17:22, Dmitry Dolgov <9erthalion6@gmail.com> wrote:
There were reviews: Andrey Borodin raised issues with the patch in
[2], I spent some time trying it and asked questions about the design
in [3], but no one followed up on them within the next months.
About the status of the patch, to me it should be RWF. It's been
moved to the next CF several times with no progress besides rebases.
Let me disagree. Judging from the comments in this discussion, it could be a
significant and useful feature, and the author is trying to keep this patch
up to date. The lack of reviews could be due to reasons other than the desirability of
the patch (as is the case for many other interesting proposals on hackers).
Regarding the status of this patch: Marina is the original author of the patch, but I'm interested in pushing it. Basically, I asked Marina to provide some code from Postgres Pro so that we could discuss the feature.
Daniel has raised an important interface question in his review. Using a libc-style locale in lc_collate is not a perfect choice for many ICU-only collations.
I'd work on the patch if I knew how to improve the interface, but I need input from the community on what this interface should look like.
I intend to provide some solution to this question before the next CF, but I did not find enough time during this CF (besides the rebase).
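For readers who have not looked at the patch: its documentation describes the value stored in pg_database.datcollate as locale@provider[.version], with the provider being icu or libc. Below is a rough Python sketch of how such a value splits apart, assuming exactly that format. The function and type names are made up for illustration, the version string in the example is invented, and this is not the patch's code.

```python
from typing import NamedTuple, Optional

# Providers named in the patch's documentation.
KNOWN_PROVIDERS = {"icu", "libc"}


class PackedCollation(NamedTuple):
    """Hypothetical container for the parts of a packed datcollate value."""
    locale: str
    provider: str
    version: Optional[str]


def split_datcollate(value: str) -> PackedCollation:
    """Split 'locale@provider[.version]' into its parts.

    The split happens at the LAST '@', so libc modifiers that are part of
    the locale name itself (for example 'de_DE@euro') stay in the locale.
    """
    locale, sep, tail = value.rpartition("@")
    if not sep:
        raise ValueError(f"no provider modifier in {value!r}")
    # Everything after the first '.' in the tail is the collation version.
    provider, _, version = tail.partition(".")
    if provider not in KNOWN_PROVIDERS:
        raise ValueError(f"unknown collation provider {provider!r}")
    return PackedCollation(locale, provider, version or None)


print(split_datcollate("en_US.UTF-8@icu.153.80"))
print(split_datcollate("C@libc"))
print(split_datcollate("de_DE@euro@libc"))
```

Note that the locale part stays a libc-style name even when the provider is icu, which is the interface concern raised in the review. The sketch does not check the locale name itself against libc or ICU.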
Best regards, Andrey Borodin.
On Sun, Dec 2, 2018 at 4:21 AM Dmitry Dolgov <9erthalion6@gmail.com> wrote:
About the status of the patch, to me it should be RWF. It's been
moved to the next CF several times with no progress besides rebases.
Let me disagree. Judging from the comments in this discussion, it could be a
significant and useful feature, and the author is trying to keep this patch
up to date. The lack of reviews could be due to reasons other than the desirability of
the patch (as is the case for many other interesting proposals on hackers).
+1. I, for one, care about this feature. We're all very busy, but I
don't want to see it die.
--
Peter Geoghegan
On 02/12/2018 15:40, Andrey Borodin wrote:
Daniel has raised an important interface question in his review. Using a libc-style locale in lc_collate is not a perfect choice for many ICU-only collations.
I'd work on the patch if I knew how to improve the interface, but I need input from the community on what this interface should look like.
Figuring out the interface is the hard part. Several options have been
discussed in this thread and earlier threads.
My current thinking is that we should add a datcollprovider column to
pg_database, and then store the ICU locale name in datcollate. So
mirror the columns of pg_collation in pg_database.
Another issue is that we'd need to carefully divide up the role of the
"default" collation and the "default" provider. The default collation
is the collation defined for the database, the default provider means to
use the libc non-locale_t enabled API functions. Right now these are
always the same, but if the database-global locale is ICU, then the
default collation would use the ICU provider.
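To make the distinction concrete, here is a minimal Python sketch of what a separate provider column would mean for provider lookup. It is purely illustrative: the datcollprovider column is only the proposal above and does not exist, and the Database class and effective_provider function are hypothetical names. The one-letter codes follow pg_collation.collprovider (d = database default, c = libc, i = icu).

```python
from dataclasses import dataclass

# One-letter provider codes, as in pg_collation.collprovider.
PROVIDER_DEFAULT = "d"
PROVIDER_LIBC = "c"
PROVIDER_ICU = "i"


@dataclass(frozen=True)
class Database:
    """Hypothetical view of a pg_database row with the proposed column."""
    datname: str
    datcollprovider: str  # proposed column: PROVIDER_LIBC or PROVIDER_ICU
    datcollate: str       # libc locale name or ICU locale name


def effective_provider(collprovider: str, db: Database) -> str:
    """Return the provider that actually compares strings for a collation.

    The "default" collation defers to whatever the database was created
    with; any other collation carries its own provider.
    """
    if collprovider == PROVIDER_DEFAULT:
        return db.datcollprovider
    return collprovider


libc_db = Database("app_libc", PROVIDER_LIBC, "en_US.UTF-8")
icu_db = Database("app_icu", PROVIDER_ICU, "en-US")

print(effective_provider(PROVIDER_DEFAULT, libc_db))  # default collation, libc database
print(effective_provider(PROVIDER_DEFAULT, icu_db))   # default collation, ICU database
print(effective_provider(PROVIDER_ICU, libc_db))      # explicit ICU collation
```

With today's catalogs only the first case can occur for the default collation, which is why the two meanings of "default" have not needed separating so far.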
My recently posted patch "Reorganize collation lookup time and place" is
meant to help reorganize the APIs to make this simpler. It doesn't have
all the answers yet, but I think it's a step in this direction.
If we have well-designed answers to these questions, I'd imagine that
the actual feature patch would be quite small. I was very surprised to
see how large this patch is and how much code it moves around without
much explanation. I don't think it's worth reviewing this patch any
further. It needs several steps back and some fundamental design and
refactoring work.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Peter Eisentraut wrote:
Another issue is that we'd need to carefully divide up the role of the
"default" collation and the "default" provider. The default collation
is the collation defined for the database, the default provider means to
use the libc non-locale_t enabled API functions. Right now these are
always the same, but if the database-global locale is ICU, then the
default collation would use the ICU provider.
I think one related issue that the patch works around by using a libc locale
as a proxy is knowing what to put into libc's LC_CTYPE and LC_COLLATE.
In fact I've been wondering if that's the main reason for the interface
implemented by the patch.
Otherwise, how should these env variables be initialized for ICU
databases?
For instance in the existing FTS code, lowerstr_with_len() in
tsearch/ts_locale.c calls tolower() or towlower() to fold a string to
lower case when normalizing lexemes. This requires LC_CTYPE to be set
to something compatible with the database encoding, at the very
least. Even if that code looks like it might need to be changed for
ICU anyway (or just to be collation-aware according to the TODO marks?),
what about comparable calls in extensions?
In the case that we don't touch libc's LC_COLLATE/LC_CTYPE in backends,
extension code would have them inherited from the postmaster? Does that
sound acceptable? If not, maybe ICU databases should have these as
settable options, in addition to their ICU locale?
Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
On 12/12/2018 15:57, Daniel Verite wrote:
I think one related issue that the patch works around by using a libc locale
as a proxy is knowing what to put into libc's LC_CTYPE and LC_COLLATE.
In fact I've been wondering if that's the main reason for the interface
implemented by the patch.
So it seems, but then it should be called out more clearly.
Otherwise, how should these env variables be initialized for ICU
databases?
I think when using ICU by default, then it should not matter because we
shouldn't be calling any libc functions that use those settings. Maybe
there need to be some exceptions, but again we should call those out
more clearly.
We could set them to "C" for consistency perhaps.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Dec 11, 2018 at 09:22:48AM +0100, Peter Eisentraut wrote:
If we have well-designed answers to these questions, I'd imagine that
the actual feature patch would be quite small. I was very surprised to
see how large this patch is and how much code it moves around without
much explanation. I don't think it's worth reviewing this patch any
further. It needs several steps back and some fundamental design and
refactoring work.
Marked as returned with feedback. This thread has stalled.
--
Michael
Hello Andrey,
we would like to see ICU collations become the default for entire
databases as well. Therefore we would also review the patch.
Unfortunately your patch from late October does not apply to the current
master.
Besides that, I noticed that the patch applies to the October master but
results in errors when compiling without "--with-icu" and executing
"make check-world":
libpq.so: Warning: undefined reference to »get_collprovider_name«
libpq.so: Warning: undefined reference to »is_valid_nondefault_collprovider«
libpq.so: Warning: undefined reference to »get_collprovider«
This may be caused by your last modification:
I've added LDFLAGS_INTERNAL += $(ICU_LIBS) in libpq, but I'm not
entirely sure this is correct way to deal with complaints on ICU
functions from libpq linking.
Best regards,
Marius Timmer
--
Westfälische Wilhelms-Universität Münster (WWU)
Zentrum für Informationsverarbeitung (ZIV)
Röntgenstraße 7-13
48149 Münster
+49 251 83 31158
marius.timmer@uni-muenster.de
https://www.uni-muenster.de/ZIV
Hello hackers,
as I mentioned three weeks ago the patch from October 2018 did not apply
on the master. In the meantime I rebased it. Additionally I fixed some
Makefiles because a few icu-libs were missing. Now this patch applies
and compiles successfully on my machine. After installing, running "make
installcheck-world" results in some failures (for example "select"). I
will take a closer look at those failures and review the whole patch in
the next few days. I just wanted to avoid that you have to do the same
rebasing stuff. The new patch is attached to this mail.
Maybe this patch should be moved to the current commit fest to keep
track of it. What do you think?
Best regards,
Marius Timmer
--
Westfälische Wilhelms-Universität Münster (WWU)
Zentrum für Informationsverarbeitung (ZIV)
Röntgenstraße 7-13
Besucheradresse: Einsteinstraße 60 - Raum 101
48149 Münster
+49 251 83 31158
marius.timmer@uni-muenster.de
https://www.uni-muenster.de/ZIV
Attachments:
0002-ICU-as-default-collation-provider.patch (text/x-patch)
diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml
index a6143ef8a7..8a46d3d311 100644
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@@ -537,6 +537,61 @@ SELECT * FROM test1 ORDER BY a || b COLLATE "fr_FR";
a database.
</para>
+ <para>
+ You can specify the default collation provider with the <option>--locale</option>
+ and <option>--lc-collate</option> options of the <xref linkend="app-initdb"/> or
+ <xref linkend="app-createdb"/> commands, as follows:
+<programlisting>
+--locale=<replaceable>locale</replaceable>[@<replaceable>provider</replaceable>]
+--lc-collate=<replaceable>locale</replaceable>[@<replaceable>provider</replaceable>]
+</programlisting>
+ where <replaceable>provider</replaceable> can take the <literal>icu</literal>
+ or <literal>libc</literal> value, and <replaceable>locale</replaceable> is specified
+ in the <literal>libc</literal> format. You can only specify a single
+ locale provider after the <literal>@</literal> symbol.
+ The <literal>--lc-collate</literal> option overrides the
+ <literal>--locale</literal> setting, regardless of whether it specifies the
+ collation provider.
+ </para>
+
+ <para>
+ If you omit the collation provider options, <literal>libc</literal>
+ provider is used for <literal>C</literal> and <literal>POSIX</literal>
+ locales. For other locales, the default providers are:
+ <itemizedlist spacing="compact">
+ <listitem>
+ <para>
+ <literal>icu</literal> at the cluster level
+ </para>
+ </listitem>
+ <listitem>
+ <para>Default collation provider from the template database at
+ the database level
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+
+ <important>
+ <para>
+ You can only use the <literal>icu</literal> collation provider for locales that are
+ supported by <literal>libc</literal> in your operating system and satisfy all
+ restrictions applicable to <literal>icu</literal>.
+ </para>
+ </important>
+
+ <para>
+ When you connect to a database,
+ <productname>PostgreSQL</productname> checks that the selected collation
+ provider and the version of the default collation are supported.
+ You can find the default database collation and the collation provider
+ in <structname>pg_database.datcollate</structname>. For ICU collations, collation version is
+ also stored:
+ <programlisting>
+<replaceable>locale</replaceable>@<replaceable>provider</replaceable>[.<replaceable>version</replaceable>]
+</programlisting>
+ </para>
+
<sect3>
<title>Standard Collations</title>
diff --git a/doc/src/sgml/ref/create_database.sgml b/doc/src/sgml/ref/create_database.sgml
index b2c9e241c2..8b2e153651 100644
--- a/doc/src/sgml/ref/create_database.sgml
+++ b/doc/src/sgml/ref/create_database.sgml
@@ -25,7 +25,7 @@ CREATE DATABASE <replaceable class="parameter">name</replaceable>
[ [ WITH ] [ OWNER [=] <replaceable class="parameter">user_name</replaceable> ]
[ TEMPLATE [=] <replaceable class="parameter">template</replaceable> ]
[ ENCODING [=] <replaceable class="parameter">encoding</replaceable> ]
- [ LC_COLLATE [=] <replaceable class="parameter">lc_collate</replaceable> ]
+ [ LC_COLLATE [=] <replaceable class="parameter">lc_collate</replaceable>[@<replaceable class="parameter">provider</replaceable>] ]
[ LC_CTYPE [=] <replaceable class="parameter">lc_ctype</replaceable> ]
[ TABLESPACE [=] <replaceable class="parameter">tablespace_name</replaceable> ]
[ ALLOW_CONNECTIONS [=] <replaceable class="parameter">allowconn</replaceable> ]
@@ -112,13 +112,17 @@ CREATE DATABASE <replaceable class="parameter">name</replaceable>
</listitem>
</varlistentry>
<varlistentry>
- <term><replaceable class="parameter">lc_collate</replaceable></term>
+ <term><replaceable class="parameter">lc_collate</replaceable>[@<replaceable class="parameter">provider</replaceable>]</term>
<listitem>
<para>
Collation order (<literal>LC_COLLATE</literal>) to use in the new database.
This affects the sort order applied to strings, e.g. in queries with
ORDER BY, as well as the order used in indexes on text columns.
The default is to use the collation order of the template database.
+ Optionally, you can specify the default collation provider after the
+ <literal>@</literal> symbol, as explained in
+ <xref linkend="collation-managing"/>. Supported values are <literal>icu</literal>
+ and <literal>libc</literal>.
See below for additional restrictions.
</para>
</listitem>
diff --git a/doc/src/sgml/ref/createdb.sgml b/doc/src/sgml/ref/createdb.sgml
index 2658efeb1a..dbf87d31ec 100644
--- a/doc/src/sgml/ref/createdb.sgml
+++ b/doc/src/sgml/ref/createdb.sgml
@@ -121,22 +121,34 @@ PostgreSQL documentation
</varlistentry>
<varlistentry>
- <term><option>-l <replaceable class="parameter">locale</replaceable></option></term>
- <term><option>--locale=<replaceable class="parameter">locale</replaceable></option></term>
+ <term><option>-l <replaceable class="parameter">locale</replaceable>[@<replaceable>provider</replaceable>]</option></term>
+ <term><option>--locale=<replaceable class="parameter">locale</replaceable>[@<replaceable>provider</replaceable>]</option></term>
<listitem>
<para>
Specifies the locale to be used in this database. This is equivalent
to specifying both <option>--lc-collate</option> and <option>--lc-ctype</option>.
</para>
+
+ <para>
+ Optionally, you can specify the default collation provider after the
+ <literal>@</literal> symbol. Supported values are <literal>icu</literal>
+ and <literal>libc</literal>. For details, see <xref linkend="collation-managing"/>.
+ </para>
</listitem>
</varlistentry>
<varlistentry>
- <term><option>--lc-collate=<replaceable class="parameter">locale</replaceable></option></term>
+ <term><option>--lc-collate=<replaceable class="parameter">locale</replaceable>[@<replaceable>provider</replaceable>]</option></term>
<listitem>
<para>
Specifies the LC_COLLATE setting to be used in this database.
</para>
+
+ <para>
+ Optionally, you can specify the default collation provider after the
+ <literal>@</literal> symbol. Supported values are <literal>icu</literal>
+ and <literal>libc</literal>. For details, see <xref linkend="collation-managing"/>.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/initdb.sgml b/doc/src/sgml/ref/initdb.sgml
index 84fb37c293..7fe0bc27a7 100644
--- a/doc/src/sgml/ref/initdb.sgml
+++ b/doc/src/sgml/ref/initdb.sgml
@@ -224,7 +224,7 @@ PostgreSQL documentation
</varlistentry>
<varlistentry>
- <term><option>--locale=<replaceable>locale</replaceable></option></term>
+ <term><option>--locale=<replaceable>locale</replaceable>[@<replaceable>provider</replaceable>]</option></term>
<listitem>
<para>
Sets the default locale for the database cluster. If this
@@ -232,11 +232,16 @@ PostgreSQL documentation
environment that <command>initdb</command> runs in. Locale
support is described in <xref linkend="locale"/>.
</para>
+ <para>
+ Optionally, you can specify the default collation provider after the
+ <literal>@</literal> symbol. Supported values are <literal>icu</literal>
+ and <literal>libc</literal>. For details, see <xref linkend="collation-managing"/>.
+ </para>
</listitem>
</varlistentry>
<varlistentry>
- <term><option>--lc-collate=<replaceable>locale</replaceable></option></term>
+ <term><option>--lc-collate=<replaceable>locale</replaceable>[@<replaceable>provider</replaceable>]</option></term>
<term><option>--lc-ctype=<replaceable>locale</replaceable></option></term>
<term><option>--lc-messages=<replaceable>locale</replaceable></option></term>
<term><option>--lc-monetary=<replaceable>locale</replaceable></option></term>
diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index f08a0ee15d..18df6d0ab2 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -327,6 +327,23 @@ make check EXTRA_TESTS='collate.icu.utf8 collate.linux.utf8' LANG=en_US.utf8
</para>
</sect2>
+ <sect2>
+ <title>Extra TAP Tests for Default Collations</title>
+
+ <para>
+ To test the default collations on Linux/glibc platforms,
+ you can run extra TAP tests, as follows:
+<screen>
+make -C src/test/default_collation check-utf8
+</screen>
+ These tests only succeed when run in a database that uses the UTF-8
+ encoding. As these tests are TAP-based, you can only run them if
+ <productname>PostgreSQL</productname> was configured with the
+ <option>--enable-tap-tests</option> option.
+ For details, see <xref linkend="regress-tap"/>.
+ </para>
+ </sect2>
+
<sect2>
<title>Testing Hot Standby</title>
diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql
index 94e482596f..0b0daa0afd 100644
--- a/src/backend/catalog/information_schema.sql
+++ b/src/backend/catalog/information_schema.sql
@@ -397,7 +397,7 @@ CREATE VIEW character_sets AS
CAST(c.collname AS sql_identifier) AS default_collate_name
FROM pg_database d
LEFT JOIN (pg_collation c JOIN pg_namespace nc ON (c.collnamespace = nc.oid))
- ON (datcollate = collcollate AND datctype = collctype)
+ ON (datcollate = (collcollate || '@libc') AND datctype = collctype)
WHERE d.datname = current_database()
ORDER BY char_length(c.collname) DESC, c.collname ASC -- prefer full/canonical name
LIMIT 1;
diff --git a/src/backend/commands/collationcmds.c b/src/backend/commands/collationcmds.c
index ed3f1c12e5..fda36bedad 100644
--- a/src/backend/commands/collationcmds.c
+++ b/src/backend/commands/collationcmds.c
@@ -27,6 +27,7 @@
#include "commands/comment.h"
#include "commands/dbcommands.h"
#include "commands/defrem.h"
+#include "common/pg_collation_fn_common.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "utils/builtins.h"
@@ -162,11 +163,8 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
if (collproviderstr)
{
- if (pg_strcasecmp(collproviderstr, "icu") == 0)
- collprovider = COLLPROVIDER_ICU;
- else if (pg_strcasecmp(collproviderstr, "libc") == 0)
- collprovider = COLLPROVIDER_LIBC;
- else
+ collprovider = get_collprovider(collproviderstr);
+ if (!is_valid_nondefault_collprovider(collprovider))
ereport(ERROR,
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
errmsg("unrecognized collation provider: %s",
@@ -192,7 +190,8 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
else
{
collencoding = GetDatabaseEncoding();
- check_encoding_locale_matches(collencoding, collcollate, collctype);
+ check_encoding_locale_matches(collencoding, collcollate, collctype,
+ collprovider);
}
}
@@ -433,26 +432,6 @@ cmpaliases(const void *a, const void *b)
#ifdef USE_ICU
-/*
- * Get the ICU language tag for a locale name.
- * The result is a palloc'd string.
- */
-static char *
-get_icu_language_tag(const char *localename)
-{
- char buf[ULOC_FULLNAME_CAPACITY];
- UErrorCode status;
-
- status = U_ZERO_ERROR;
- uloc_toLanguageTag(localename, buf, sizeof(buf), TRUE, &status);
- if (U_FAILURE(status))
- ereport(ERROR,
- (errmsg("could not convert locale name \"%s\" to language tag: %s",
- localename, u_errorName(status))));
-
- return pstrdup(buf);
-}
-
/*
* Get a comment (specifically, the display name) for an ICU locale.
* The result is a palloc'd string, or NULL if we can't get a comment
@@ -698,7 +677,7 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
name = uloc_getAvailable(i);
langtag = get_icu_language_tag(name);
- collcollate = U_ICU_VERSION_MAJOR_NUM >= 54 ? langtag : name;
+ collcollate = get_icu_collate(name, langtag);
/*
* Be paranoid about not allowing any non-ASCII strings into
diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index 35cad0b629..cf758b4320 100644
--- a/src/backend/commands/dbcommands.c
+++ b/src/backend/commands/dbcommands.c
@@ -35,6 +35,7 @@
#include "catalog/indexing.h"
#include "catalog/objectaccess.h"
#include "catalog/pg_authid.h"
+#include "catalog/pg_collation.h"
#include "catalog/pg_database.h"
#include "catalog/pg_db_role_setting.h"
#include "catalog/pg_subscription.h"
@@ -45,6 +46,7 @@
#include "commands/defrem.h"
#include "commands/seclabel.h"
#include "commands/tablespace.h"
+#include "common/pg_collation_fn_common.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -141,6 +143,14 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
int notherbackends;
int npreparedxacts;
createdb_failure_params fparms;
+ char *src_canonname;
+ char src_collprovider;
+ char *dbcanonname = NULL;
+ char dbcollprovider;
+ char *dbcollate_full_name;
+ char *icu_wincollate = NULL;
+ char *langtag = NULL;
+ const char *collate;
/* Extract options from the statement node tree */
foreach(option, stmt->options)
@@ -350,8 +360,28 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
/* If encoding or locales are defaulted, use source's setting */
if (encoding < 0)
encoding = src_encoding;
+
+ check_locale_collprovider(src_collate, &src_canonname, &src_collprovider,
+ NULL);
+
+ if (!is_valid_nondefault_collprovider(src_collprovider))
+ /* This could happen when manually creating a mess in the catalogs. */
+ ereport(FATAL,
+ (errmsg("could not find out the collation provider for datcollate \"%s\" of template database \"%s\"",
+ src_collate, dbtemplate)));
+
if (dbcollate == NULL)
- dbcollate = src_collate;
+ {
+ dbcollate = src_canonname;
+ dbcollprovider = src_collprovider;
+ }
+ else
+ {
+ check_locale_collprovider(dbcollate, &dbcanonname, &dbcollprovider,
+ NULL);
+ dbcollate = dbcanonname;
+ }
+
if (dbctype == NULL)
dbctype = src_ctype;
@@ -362,18 +392,88 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
errmsg("invalid server encoding %d", encoding)));
/* Check that the chosen locales are valid, and get canonical spellings */
- if (!check_locale(LC_COLLATE, dbcollate, &canonname))
- ereport(ERROR,
- (errcode(ERRCODE_WRONG_OBJECT_TYPE),
- errmsg("invalid locale name: \"%s\"", dbcollate)));
- dbcollate = canonname;
- if (!check_locale(LC_CTYPE, dbctype, &canonname))
+
+ if (!check_locale(LC_CTYPE, dbctype, &canonname, '\0'))
ereport(ERROR,
(errcode(ERRCODE_WRONG_OBJECT_TYPE),
errmsg("invalid locale name: \"%s\"", dbctype)));
dbctype = canonname;
- check_encoding_locale_matches(encoding, dbcollate, dbctype);
+ /* we always check lc_collate for libc */
+ if (!check_locale(LC_COLLATE, dbcollate, &canonname, COLLPROVIDER_LIBC))
+ ereport(ERROR,
+ (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("invalid locale name: \"%s\" (provider \"%s\")",
+ dbcollate, get_collprovider_name(COLLPROVIDER_LIBC))));
+ dbcollate = canonname;
+
+ /* determine the collation provider if we haven't already done it */
+ if (!is_valid_nondefault_collprovider(dbcollprovider))
+ {
+ if (locale_is_c(dbcollate))
+ dbcollprovider = COLLPROVIDER_LIBC;
+ else
+ dbcollprovider = src_collprovider;
+ }
+
+ Assert(is_valid_nondefault_collprovider(dbcollprovider));
+
+#ifndef USE_ICU
+ if (dbcollprovider == COLLPROVIDER_ICU)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("ICU is not supported in this build"),
+ errhint("You need to rebuild PostgreSQL using --with-icu.")));
+#endif
+
+ /* check lc_collate and lc_ctype for icu if we need it */
+ if (dbcollprovider == COLLPROVIDER_ICU)
+ {
+ if (!check_locale(LC_COLLATE, dbcollate, NULL, dbcollprovider))
+ ereport(ERROR,
+ (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("invalid locale name: \"%s\" (provider \"%s\")",
+ dbcollate, get_collprovider_name(dbcollprovider))));
+
+ if (strcmp(dbcollate, dbctype) != 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("collations with different collate and ctype values are not supported by ICU")));
+ }
+
+ check_encoding_locale_matches(encoding, dbcollate, dbctype, dbcollprovider);
+
+ /* get the collation version */
+
+#ifdef USE_ICU
+ if (dbcollprovider == COLLPROVIDER_ICU)
+ {
+ collate = (const char *) dbcollate;
+#ifdef WIN32
+ if (!locale_is_c(collate))
+ {
+ icu_wincollate = check_icu_winlocale(collate);
+ collate = (const char *) icu_wincollate;
+ }
+#endif /* WIN32 */
+ langtag = get_icu_language_tag(collate);
+ collate = get_icu_collate(collate, langtag);
+ }
+ else
+#endif /* USE_ICU */
+ {
+ /* COLLPROVIDER_LIBC */
+ collate = (const char *) dbcollate;
+ }
+
+ dbcollate_full_name = get_full_collation_name(
+ dbcollate, dbcollprovider,
+ get_collation_actual_version(dbcollprovider, collate));
+
+ if (strlen(dbcollate_full_name) >= NAMEDATALEN)
+ ereport(ERROR,
+ (errmsg("the full database collation name \"%s\" is too long",
+ dbcollate_full_name)));
/*
* Check that the new encoding and locale settings match the source
@@ -395,11 +495,11 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
pg_encoding_to_char(src_encoding)),
errhint("Use the same encoding as in the template database, or use template0 as template.")));
- if (strcmp(dbcollate, src_collate) != 0)
+ if (strcmp(dbcollate_full_name, src_collate) != 0)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("new collation (%s) is incompatible with the collation of the template database (%s)",
- dbcollate, src_collate),
+ dbcollate_full_name, src_collate),
errhint("Use the same collation as in the template database, or use template0 as template.")));
if (strcmp(dbctype, src_ctype) != 0)
@@ -524,7 +624,7 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
new_record[Anum_pg_database_datdba - 1] = ObjectIdGetDatum(datdba);
new_record[Anum_pg_database_encoding - 1] = Int32GetDatum(encoding);
new_record[Anum_pg_database_datcollate - 1] =
- DirectFunctionCall1(namein, CStringGetDatum(dbcollate));
+ DirectFunctionCall1(namein, CStringGetDatum(dbcollate_full_name));
new_record[Anum_pg_database_datctype - 1] =
DirectFunctionCall1(namein, CStringGetDatum(dbctype));
new_record[Anum_pg_database_datistemplate - 1] = BoolGetDatum(dbistemplate);
@@ -691,6 +791,16 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
*/
ForceSyncCommit();
}
+
+ pfree(src_canonname);
+ pfree(dbcollate_full_name);
+ if (dbcanonname)
+ pfree(dbcanonname);
+ if (langtag)
+ pfree(langtag);
+ if (icu_wincollate)
+ pfree(icu_wincollate);
+
PG_END_ENSURE_ERROR_CLEANUP(createdb_failure_callback,
PointerGetDatum(&fparms));
@@ -720,7 +830,8 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
* Note: if you change this policy, fix initdb to match.
*/
void
-check_encoding_locale_matches(int encoding, const char *collate, const char *ctype)
+check_encoding_locale_matches(int encoding, const char *collate, const char *ctype,
+ char collprovider)
{
int ctype_encoding = pg_get_encoding_from_locale(ctype, true);
int collate_encoding = pg_get_encoding_from_locale(collate, true);
@@ -754,6 +865,23 @@ check_encoding_locale_matches(int encoding, const char *collate, const char *cty
collate),
errdetail("The chosen LC_COLLATE setting requires encoding \"%s\".",
pg_encoding_to_char(collate_encoding))));
+
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+#ifdef USE_ICU
+ if (!(is_encoding_supported_by_icu(encoding) ||
+ (encoding == PG_SQL_ASCII && superuser())))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("encoding \"%s\" is not supported for ICU locales",
+ pg_encoding_to_char(encoding))));
+#else
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("ICU is not supported in this build"),
+ errhint("You need to rebuild PostgreSQL using --with-icu.")));
+#endif
+ }
}
/* Error cleanup callback for createdb */
diff --git a/src/backend/main/main.c b/src/backend/main/main.c
index 7b18f8c758..4930251318 100644
--- a/src/backend/main/main.c
+++ b/src/backend/main/main.c
@@ -32,6 +32,7 @@
#endif
#include "bootstrap/bootstrap.h"
+#include "catalog/pg_collation.h"
#include "common/username.h"
#include "port/atomics.h"
#include "postmaster/postmaster.h"
@@ -306,8 +307,8 @@ startup_hacks(const char *progname)
static void
init_locale(const char *categoryname, int category, const char *locale)
{
- if (pg_perm_setlocale(category, locale) == NULL &&
- pg_perm_setlocale(category, "C") == NULL)
+ if (pg_perm_setlocale(category, locale, COLLPROVIDER_LIBC) == NULL &&
+ pg_perm_setlocale(category, "C", COLLPROVIDER_LIBC) == NULL)
elog(FATAL, "could not adopt \"%s\" locale nor C locale for %s",
locale, categoryname);
}
diff --git a/src/backend/regex/regc_pg_locale.c b/src/backend/regex/regc_pg_locale.c
index a8c0b156fa..8f30d31a75 100644
--- a/src/backend/regex/regc_pg_locale.c
+++ b/src/backend/regex/regc_pg_locale.c
@@ -16,6 +16,7 @@
*/
#include "catalog/pg_collation.h"
+#include "common/pg_collation_fn_common.h"
#include "utils/pg_locale.h"
/*
@@ -240,8 +241,13 @@ pg_set_regex_collation(Oid collation)
}
else
{
+ char collprovider;
+
if (collation == DEFAULT_COLLATION_OID)
+ {
pg_regex_locale = 0;
+ collprovider = get_default_collprovider();
+ }
else if (OidIsValid(collation))
{
/*
@@ -250,6 +256,7 @@ pg_set_regex_collation(Oid collation)
* have to be considered below.
*/
pg_regex_locale = pg_newlocale_from_collation(collation);
+ collprovider = pg_regex_locale->provider;
}
else
{
@@ -263,24 +270,35 @@ pg_set_regex_collation(Oid collation)
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
+ Assert(is_valid_nondefault_collprovider(collprovider));
+
+ if (collprovider == COLLPROVIDER_ICU)
+ {
#ifdef USE_ICU
- if (pg_regex_locale && pg_regex_locale->provider == COLLPROVIDER_ICU)
pg_regex_strategy = PG_REGEX_LOCALE_ICU;
- else
+#else
+ /* shouldn't happen */
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
#endif
- if (GetDatabaseEncoding() == PG_UTF8)
- {
- if (pg_regex_locale)
- pg_regex_strategy = PG_REGEX_LOCALE_WIDE_L;
- else
- pg_regex_strategy = PG_REGEX_LOCALE_WIDE;
}
else
{
- if (pg_regex_locale)
- pg_regex_strategy = PG_REGEX_LOCALE_1BYTE_L;
+ /* COLLPROVIDER_LIBC */
+
+ if (GetDatabaseEncoding() == PG_UTF8)
+ {
+ if (pg_regex_locale)
+ pg_regex_strategy = PG_REGEX_LOCALE_WIDE_L;
+ else
+ pg_regex_strategy = PG_REGEX_LOCALE_WIDE;
+ }
else
- pg_regex_strategy = PG_REGEX_LOCALE_1BYTE;
+ {
+ if (pg_regex_locale)
+ pg_regex_strategy = PG_REGEX_LOCALE_1BYTE_L;
+ else
+ pg_regex_strategy = PG_REGEX_LOCALE_1BYTE;
+ }
}
pg_regex_collation = collation;
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index df1db7bc9f..8933317a7d 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -1479,7 +1479,7 @@ typedef int32_t (*ICU_Convert_Func) (UChar *dest, int32_t destCapacity,
UErrorCode *pErrorCode);
static int32_t
-icu_convert_case(ICU_Convert_Func func, pg_locale_t mylocale,
+icu_convert_case(ICU_Convert_Func func, const char *locale,
UChar **buff_dest, UChar *buff_source, int32_t len_source)
{
UErrorCode status;
@@ -1489,7 +1489,7 @@ icu_convert_case(ICU_Convert_Func func, pg_locale_t mylocale,
*buff_dest = palloc(len_dest * sizeof(**buff_dest));
status = U_ZERO_ERROR;
len_dest = func(*buff_dest, len_dest, buff_source, len_source,
- mylocale->info.icu.locale, &status);
+ locale, &status);
if (status == U_BUFFER_OVERFLOW_ERROR)
{
/* try again with adjusted length */
@@ -1497,7 +1497,7 @@ icu_convert_case(ICU_Convert_Func func, pg_locale_t mylocale,
*buff_dest = palloc(len_dest * sizeof(**buff_dest));
status = U_ZERO_ERROR;
len_dest = func(*buff_dest, len_dest, buff_source, len_source,
- mylocale->info.icu.locale, &status);
+ locale, &status);
}
if (U_FAILURE(status))
ereport(ERROR,
@@ -1555,8 +1555,15 @@ str_tolower(const char *buff, size_t nbytes, Oid collid)
else
{
pg_locale_t mylocale = 0;
+ char collprovider;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY,
+ use_icu;
- if (collid != DEFAULT_COLLATION_OID)
+ if (collid == DEFAULT_COLLATION_OID)
+ {
+ collprovider = get_default_collprovider();
+ }
+ else
{
if (!OidIsValid(collid))
{
@@ -1570,25 +1577,43 @@ str_tolower(const char *buff, size_t nbytes, Oid collid)
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
mylocale = pg_newlocale_from_collation(collid);
+ collprovider = mylocale->provider;
}
-#ifdef USE_ICU
- if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
+
+ if (use_icu)
{
+#ifdef USE_ICU
int32_t len_uchar;
int32_t len_conv;
UChar *buff_uchar;
UChar *buff_conv;
+ const char *locale;
+
+ if (mylocale)
+ locale = mylocale->info.icu.locale;
+ else
+ locale = get_icu_default_collate();
len_uchar = icu_to_uchar(&buff_uchar, buff, nbytes);
- len_conv = icu_convert_case(u_strToLower, mylocale,
+ len_conv = icu_convert_case(u_strToLower, locale,
&buff_conv, buff_uchar, len_uchar);
icu_from_uchar(&result, buff_conv, len_conv);
pfree(buff_uchar);
+#else /* not USE_ICU */
+ /* shouldn't happen */
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
+#endif /* not USE_ICU */
}
else
-#endif
{
+ /* use_libc */
+
if (pg_database_encoding_max_length() > 1)
{
wchar_t *workspace;
@@ -1677,8 +1702,15 @@ str_toupper(const char *buff, size_t nbytes, Oid collid)
else
{
pg_locale_t mylocale = 0;
+ char collprovider;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY,
+ use_icu;
- if (collid != DEFAULT_COLLATION_OID)
+ if (collid == DEFAULT_COLLATION_OID)
+ {
+ collprovider = get_default_collprovider();
+ }
+ else
{
if (!OidIsValid(collid))
{
@@ -1692,25 +1724,43 @@ str_toupper(const char *buff, size_t nbytes, Oid collid)
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
mylocale = pg_newlocale_from_collation(collid);
+ collprovider = mylocale->provider;
}
-#ifdef USE_ICU
- if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
+
+ if (use_icu)
{
+#ifdef USE_ICU
int32_t len_uchar,
len_conv;
UChar *buff_uchar;
UChar *buff_conv;
+ const char *locale;
+
+ if (mylocale)
+ locale = mylocale->info.icu.locale;
+ else
+ locale = get_icu_default_collate();
len_uchar = icu_to_uchar(&buff_uchar, buff, nbytes);
- len_conv = icu_convert_case(u_strToUpper, mylocale,
+ len_conv = icu_convert_case(u_strToUpper, locale,
&buff_conv, buff_uchar, len_uchar);
icu_from_uchar(&result, buff_conv, len_conv);
pfree(buff_uchar);
+#else /* not USE_ICU */
+ /* shouldn't happen */
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
+#endif /* not USE_ICU */
}
else
-#endif
{
+ /* use_libc */
+
if (pg_database_encoding_max_length() > 1)
{
wchar_t *workspace;
@@ -1800,8 +1850,15 @@ str_initcap(const char *buff, size_t nbytes, Oid collid)
else
{
pg_locale_t mylocale = 0;
+ char collprovider;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY,
+ use_icu;
- if (collid != DEFAULT_COLLATION_OID)
+ if (collid == DEFAULT_COLLATION_OID)
+ {
+ collprovider = get_default_collprovider();
+ }
+ else
{
if (!OidIsValid(collid))
{
@@ -1815,25 +1872,43 @@ str_initcap(const char *buff, size_t nbytes, Oid collid)
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
mylocale = pg_newlocale_from_collation(collid);
+ collprovider = mylocale->provider;
}
-#ifdef USE_ICU
- if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
+
+ if (use_icu)
{
+#ifdef USE_ICU
int32_t len_uchar,
len_conv;
UChar *buff_uchar;
UChar *buff_conv;
+ const char *locale;
+
+ if (mylocale)
+ locale = mylocale->info.icu.locale;
+ else
+ locale = get_icu_default_collate();
len_uchar = icu_to_uchar(&buff_uchar, buff, nbytes);
- len_conv = icu_convert_case(u_strToTitle_default_BI, mylocale,
+ len_conv = icu_convert_case(u_strToTitle_default_BI, locale,
&buff_conv, buff_uchar, len_uchar);
icu_from_uchar(&result, buff_conv, len_conv);
pfree(buff_uchar);
+#else /* not USE_ICU */
+ /* shouldn't happen */
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
+#endif /* not USE_ICU */
}
else
-#endif
{
+ /* use_libc */
+
if (pg_database_encoding_max_length() > 1)
{
wchar_t *workspace;
diff --git a/src/backend/utils/adt/like.c b/src/backend/utils/adt/like.c
index 853c9c01e9..c5f98509c5 100644
--- a/src/backend/utils/adt/like.c
+++ b/src/backend/utils/adt/like.c
@@ -167,6 +167,9 @@ Generic_Text_IC_like(text *str, text *pat, Oid collation)
plen;
pg_locale_t locale = 0;
bool locale_is_c = false;
+ char collprovider = COLLPROVIDER_LIBC;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY;
+ bool use_icu;
if (lc_ctype_is_c(collation))
locale_is_c = true;
@@ -184,7 +187,18 @@ Generic_Text_IC_like(text *str, text *pat, Oid collation)
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
locale = pg_newlocale_from_collation(collation);
+ collprovider = locale->provider;
}
+ else
+ {
+ collprovider = get_default_collprovider();
+ }
+
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
/*
* For efficiency reasons, in the single byte case we don't call lower()
@@ -194,7 +208,7 @@ Generic_Text_IC_like(text *str, text *pat, Oid collation)
* way.
*/
- if (pg_database_encoding_max_length() > 1 || (locale && locale->provider == COLLPROVIDER_ICU))
+ if (pg_database_encoding_max_length() > 1 || use_icu)
{
/* lower's result is never packed, so OK to use old macros here */
pat = DatumGetTextPP(DirectFunctionCall1Coll(lower, collation,
diff --git a/src/backend/utils/adt/like_support.c b/src/backend/utils/adt/like_support.c
index 69509811ef..2087478704 100644
--- a/src/backend/utils/adt/like_support.c
+++ b/src/backend/utils/adt/like_support.c
@@ -97,7 +97,7 @@ static Selectivity regex_selectivity(const char *patt, int pattlen,
bool case_insensitive,
int fixed_prefix_len);
static int pattern_char_isalpha(char c, bool is_multibyte,
- pg_locale_t locale, bool locale_is_c);
+ pg_locale_t locale, char collprovider, bool locale_is_c);
static Const *make_greater_string(const Const *str_const, FmgrInfo *ltproc,
Oid collation);
static Datum string_to_datum(const char *str, Oid datatype);
@@ -921,6 +921,7 @@ like_fixed_prefix(Const *patt_const, bool case_insensitive, Oid collation,
bool is_multibyte = (pg_database_encoding_max_length() > 1);
pg_locale_t locale = 0;
bool locale_is_c = false;
+ char collprovider = COLLPROVIDER_LIBC;
/* the right-hand const is type text or bytea */
Assert(typeid == BYTEAOID || typeid == TEXTOID);
@@ -949,6 +950,9 @@ like_fixed_prefix(Const *patt_const, bool case_insensitive, Oid collation,
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
locale = pg_newlocale_from_collation(collation);
+ collprovider = locale->provider;
+ } else {
+ collprovider = get_default_collprovider();
}
}
@@ -986,7 +990,8 @@ like_fixed_prefix(Const *patt_const, bool case_insensitive, Oid collation,
/* Stop if case-varying character (it's sort of a wildcard) */
if (case_insensitive &&
- pattern_char_isalpha(patt[pos], is_multibyte, locale, locale_is_c))
+ pattern_char_isalpha(patt[pos], is_multibyte, locale,
+ collprovider, locale_is_c))
break;
match[match_pos++] = patt[pos];
@@ -1426,13 +1431,14 @@ regex_selectivity(const char *patt, int pattlen, bool case_insensitive,
*/
static int
pattern_char_isalpha(char c, bool is_multibyte,
- pg_locale_t locale, bool locale_is_c)
+ pg_locale_t locale, char collprovider, bool locale_is_c)
{
if (locale_is_c)
return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
else if (is_multibyte && IS_HIGHBIT_SET(c))
return true;
- else if (locale && locale->provider == COLLPROVIDER_ICU)
+ else if (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII)
return IS_HIGHBIT_SET(c) ? true : false;
#ifdef HAVE_LOCALE_T
else if (locale && locale->provider == COLLPROVIDER_LIBC)
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index 7fe10e284a..3b2856a77d 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -56,7 +56,10 @@
#include "access/htup_details.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_control.h"
+#include "catalog/pg_database.h"
+#include "common/pg_collation_fn_common.h"
#include "mb/pg_wchar.h"
+#include "miscadmin.h"
#include "utils/builtins.h"
#include "utils/formatting.h"
#include "utils/hsearch.h"
@@ -134,6 +137,7 @@ static char *IsoLocaleName(const char *); /* MSVC specific */
#endif
#ifdef USE_ICU
+static char *check_icu_locale(const char *locale);
static void icu_set_collation_attributes(UCollator *collator, const char *loc);
#endif
@@ -150,13 +154,45 @@ static void icu_set_collation_attributes(UCollator *collator, const char *loc);
* also be unset to fully ensure that, but that has to be done elsewhere after
* all the individual LC_XXX variables have been set correctly. (Thank you
* Perl for making this kluge necessary.)
+ *
+ * Set collprovider to '\0' if category is not LC_COLLATE.
*/
-char *
-pg_perm_setlocale(int category, const char *locale)
+const char *
+pg_perm_setlocale(int category, const char *locale, char collprovider)
{
- char *result;
+ const char *result;
const char *envvar;
char *envbuf;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY =
+ category != LC_COLLATE || collprovider == COLLPROVIDER_LIBC;
+ bool use_icu =
+ category == LC_COLLATE && collprovider == COLLPROVIDER_ICU;
+
+ Assert(use_libc || use_icu);
+
+ if (use_icu)
+ {
+#ifdef USE_ICU
+ UErrorCode status = U_ZERO_ERROR;
+ char *icu_locale = check_icu_locale(locale);
+
+ if (icu_locale == NULL && locale != NULL)
+ return NULL; /* fall out immediately on failure */
+
+ uloc_setDefault(icu_locale, &status);
+ if (U_FAILURE(status))
+ return NULL; /* fall out immediately on failure */
+
+ result = uloc_getDefault();
+ if (icu_locale)
+ pfree(icu_locale);
+ return result;
+#else /* not USE_ICU */
+ return NULL; /* fall out immediately on failure */
+#endif /* not USE_ICU */
+ }
+
+ /* use libc */
#ifndef WIN32
result = setlocale(category, locale);
@@ -171,7 +207,7 @@ pg_perm_setlocale(int category, const char *locale)
#ifdef LC_MESSAGES
if (category == LC_MESSAGES)
{
- result = (char *) locale;
+ result = locale;
if (locale == NULL || locale[0] == '\0')
return result;
}
@@ -222,7 +258,7 @@ pg_perm_setlocale(int category, const char *locale)
#ifdef WIN32
result = IsoLocaleName(locale);
if (result == NULL)
- result = (char *) locale;
+ result = locale;
#endif /* WIN32 */
break;
#endif /* LC_MESSAGES */
@@ -263,34 +299,102 @@ pg_perm_setlocale(int category, const char *locale)
* it seems that on most implementations that's the only thing it's good for;
* we could wish that setlocale gave back a canonically spelled version of
* the locale name, but typically it doesn't.)
+ *
+ * Set collprovider to '\0' if category is not LC_COLLATE.
*/
bool
-check_locale(int category, const char *locale, char **canonname)
+check_locale(int category, const char *locale, char **canonname,
+ char collprovider)
{
- char *save;
- char *res;
+ const char *save;
+ const char *res;
+ char *save_dup;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY =
+ category != LC_COLLATE || collprovider == COLLPROVIDER_LIBC;
+ bool use_icu =
+ category == LC_COLLATE && collprovider == COLLPROVIDER_ICU;
+#ifdef USE_ICU
+ UErrorCode status;
+ char *icu_locale;
+#endif
+
+ Assert(use_libc || use_icu);
if (canonname)
*canonname = NULL; /* in case of failure */
- save = setlocale(category, NULL);
- if (!save)
- return false; /* won't happen, we hope */
+#ifndef USE_ICU
+ /* cannot use icu functions */
+ if (use_icu)
+ return false;
+#endif
+
+#ifdef USE_ICU
+ if (use_icu)
+ {
+ save = uloc_getDefault();
+ if (!save)
+ return false; /* won't happen, we hope */
+ }
+ else
+#endif
+ {
+ /* use_libc */
+ save = setlocale(category, NULL);
+ if (!save)
+ return false; /* won't happen, we hope */
+ }
/* save may be pointing at a modifiable scratch variable, see above. */
- save = pstrdup(save);
+ save_dup = pstrdup(save);
/* set the locale with setlocale, to see if it accepts it. */
- res = setlocale(category, locale);
+#ifdef USE_ICU
+ if (use_icu)
+ {
+ icu_locale = check_icu_locale(locale);
+
+ if (icu_locale == NULL && locale != NULL)
+ return false; /* locale name could not be resolved */
+
+ status = U_ZERO_ERROR;
+ uloc_setDefault(icu_locale, &status);
+ if (U_FAILURE(status))
+ return false; /* won't happen, we hope */
+
+ res = uloc_getDefault();
+ if (icu_locale)
+ pfree(icu_locale);
+ }
+ else
+#endif
+ {
+ /* use_libc */
+ res = setlocale(category, locale);
+ }
/* save canonical name if requested. */
if (res && canonname)
*canonname = pstrdup(res);
/* restore old value. */
- if (!setlocale(category, save))
- elog(WARNING, "failed to restore old locale \"%s\"", save);
- pfree(save);
+#ifdef USE_ICU
+ if (use_icu)
+ {
+ status = U_ZERO_ERROR;
+ uloc_setDefault(save_dup, &status);
+ if (U_FAILURE(status))
+ elog(WARNING, "ICU error: failed to restore old locale \"%s\"",
+ save_dup);
+ }
+ else
+#endif
+ {
+ /* use_libc */
+ if (!setlocale(category, save_dup))
+ elog(WARNING, "failed to restore old locale \"%s\"", save_dup);
+ }
+ pfree(save_dup);
return (res != NULL);
}
@@ -310,7 +414,7 @@ check_locale(int category, const char *locale, char **canonname)
bool
check_locale_monetary(char **newval, void **extra, GucSource source)
{
- return check_locale(LC_MONETARY, *newval, NULL);
+ return check_locale(LC_MONETARY, *newval, NULL, '\0');
}
void
@@ -322,7 +426,7 @@ assign_locale_monetary(const char *newval, void *extra)
bool
check_locale_numeric(char **newval, void **extra, GucSource source)
{
- return check_locale(LC_NUMERIC, *newval, NULL);
+ return check_locale(LC_NUMERIC, *newval, NULL, '\0');
}
void
@@ -334,7 +438,7 @@ assign_locale_numeric(const char *newval, void *extra)
bool
check_locale_time(char **newval, void **extra, GucSource source)
{
- return check_locale(LC_TIME, *newval, NULL);
+ return check_locale(LC_TIME, *newval, NULL, '\0');
}
void
@@ -370,7 +474,7 @@ check_locale_messages(char **newval, void **extra, GucSource source)
* On Windows, we can't even check the value, so accept blindly
*/
#if defined(LC_MESSAGES) && !defined(WIN32)
- return check_locale(LC_MESSAGES, *newval, NULL);
+ return check_locale(LC_MESSAGES, *newval, NULL, '\0');
#else
return true;
#endif
@@ -384,7 +488,7 @@ assign_locale_messages(const char *newval, void *extra)
* We ignore failure, as per comment above.
*/
#ifdef LC_MESSAGES
- (void) pg_perm_setlocale(LC_MESSAGES, newval);
+ (void) pg_perm_setlocale(LC_MESSAGES, newval, '\0');
#endif
}
@@ -1100,21 +1204,14 @@ lookup_collation_cache(Oid collation, bool set_flags)
/* Attempt to set the flags */
HeapTuple tp;
Form_pg_collation collform;
- const char *collcollate;
- const char *collctype;
tp = SearchSysCache1(COLLOID, ObjectIdGetDatum(collation));
if (!HeapTupleIsValid(tp))
elog(ERROR, "cache lookup failed for collation %u", collation);
collform = (Form_pg_collation) GETSTRUCT(tp);
- collcollate = NameStr(collform->collcollate);
- collctype = NameStr(collform->collctype);
-
- cache_entry->collate_is_c = ((strcmp(collcollate, "C") == 0) ||
- (strcmp(collcollate, "POSIX") == 0));
- cache_entry->ctype_is_c = ((strcmp(collctype, "C") == 0) ||
- (strcmp(collctype, "POSIX") == 0));
+ cache_entry->collate_is_c = locale_is_c(NameStr(collform->collcollate));
+ cache_entry->ctype_is_c = locale_is_c(NameStr(collform->collctype));
cache_entry->flags_valid = true;
@@ -1145,20 +1242,28 @@ lc_collate_is_c(Oid collation)
if (collation == DEFAULT_COLLATION_OID)
{
static int result = -1;
- char *localeptr;
+ char collprovider;
if (result >= 0)
return (bool) result;
- localeptr = setlocale(LC_COLLATE, NULL);
- if (!localeptr)
- elog(ERROR, "invalid LC_COLLATE setting");
-
- if (strcmp(localeptr, "C") == 0)
- result = true;
- else if (strcmp(localeptr, "POSIX") == 0)
- result = true;
- else
+
+ collprovider = get_default_collprovider();
+ Assert(is_valid_nondefault_collprovider(collprovider));
+
+ if (collprovider == COLLPROVIDER_ICU)
+ {
result = false;
+ }
+ else
+ {
+ /* COLLPROVIDER_LIBC */
+ char *localeptr = setlocale(LC_COLLATE, NULL);
+
+ if (!localeptr)
+ elog(ERROR, "invalid LC_COLLATE setting");
+
+ result = locale_is_c(localeptr);
+ }
return (bool) result;
}
@@ -1195,20 +1300,28 @@ lc_ctype_is_c(Oid collation)
if (collation == DEFAULT_COLLATION_OID)
{
static int result = -1;
- char *localeptr;
+ char collprovider;
if (result >= 0)
return (bool) result;
- localeptr = setlocale(LC_CTYPE, NULL);
- if (!localeptr)
- elog(ERROR, "invalid LC_CTYPE setting");
-
- if (strcmp(localeptr, "C") == 0)
- result = true;
- else if (strcmp(localeptr, "POSIX") == 0)
- result = true;
- else
+
+ collprovider = get_default_collprovider();
+ Assert(is_valid_nondefault_collprovider(collprovider));
+
+ if (collprovider == COLLPROVIDER_ICU)
+ {
result = false;
+ }
+ else
+ {
+ /* COLLPROVIDER_LIBC */
+ char *localeptr = setlocale(LC_CTYPE, NULL);
+
+ if (!localeptr)
+ elog(ERROR, "invalid LC_CTYPE setting");
+
+ result = locale_is_c(localeptr);
+ }
return (bool) result;
}
@@ -1390,7 +1503,7 @@ pg_newlocale_from_collation(Oid collid)
/* We will leak this string if we get an error below :-( */
result.info.icu.locale = MemoryContextStrdup(TopMemoryContext,
collcollate);
- result.info.icu.ucol = collator;
+ result.info.icu.ucol = open_collator(collcollate);
#else /* not USE_ICU */
/* could get here if a collation was created by a build with ICU */
ereport(ERROR,
@@ -1447,46 +1560,6 @@ pg_newlocale_from_collation(Oid collid)
return cache_entry->locale;
}
-/*
- * Get provider-specific collation version string for the given collation from
- * the operating system/library.
- *
- * A particular provider must always either return a non-NULL string or return
- * NULL (if it doesn't support versions). It must not return NULL for some
- * collcollate and not NULL for others.
- */
-char *
-get_collation_actual_version(char collprovider, const char *collcollate)
-{
- char *collversion;
-
-#ifdef USE_ICU
- if (collprovider == COLLPROVIDER_ICU)
- {
- UCollator *collator;
- UErrorCode status;
- UVersionInfo versioninfo;
- char buf[U_MAX_VERSION_STRING_LENGTH];
-
- status = U_ZERO_ERROR;
- collator = ucol_open(collcollate, &status);
- if (U_FAILURE(status))
- ereport(ERROR,
- (errmsg("could not open collator for locale \"%s\": %s",
- collcollate, u_errorName(status))));
- ucol_getVersion(collator, versioninfo);
- ucol_close(collator);
-
- u_versionToString(versioninfo, buf);
- collversion = pstrdup(buf);
- }
- else
-#endif
- collversion = NULL;
-
- return collversion;
-}
-
#ifdef USE_ICU
/*
@@ -1867,3 +1940,125 @@ char2wchar(wchar_t *to, size_t tolen, const char *from, size_t fromlen,
return result;
}
+
+#ifdef USE_ICU
+/*
+ * If locale is "" return the environment value from setlocale().
+ *
+ * Otherwise return a palloc'd copy of locale if it is not NULL.
+ */
+static char *
+check_icu_locale(const char *locale)
+{
+ char *canonname = NULL;
+ char *winlocale = NULL;
+ char *result;
+
+ /* Windows locales can be in the format ".codepage" */
+ if (locale && (strlen(locale) == 0 || locale[0] == '.'))
+ {
+ check_locale(LC_COLLATE, locale, &canonname, COLLPROVIDER_LIBC);
+ locale = (const char *) canonname;
+ }
+
+#ifdef WIN32
+ if (!locale_is_c(locale))
+ {
+ winlocale = check_icu_winlocale(locale);
+ locale = (const char *) winlocale;
+ }
+#endif
+
+ result = locale ? pstrdup(locale) : NULL;
+
+ if (canonname)
+ pfree(canonname);
+ if (winlocale)
+ pfree(winlocale);
+
+ return result;
+}
+
+/*
+ * Get the default icu collation.
+ */
+const char *
+get_icu_default_collate(void)
+{
+ /* Cache the result so we only have to compute it once. */
+ static char result[NAMEDATALEN];
+ static bool cached = false;
+ const char *locale,
+ *collate;
+ char *langtag;
+
+ if (cached)
+ return result;
+
+ locale = uloc_getDefault();
+ if (!locale)
+ ereport(ERROR, (errmsg("ICU error: uloc_getDefault() failed")));
+
+ langtag = get_icu_language_tag(locale);
+ collate = get_icu_collate(locale, langtag);
+
+ if (strlen(collate) >= NAMEDATALEN)
+ ereport(FATAL,
+ (errmsg("the default ICU collation name \"%s\" is too long", collate)));
+
+ strcpy(result, collate);
+ cached = true;
+
+ pfree(langtag);
+ return result;
+}
+
+/*
+ * Get the collator for the default ICU collation.
+ */
+UCollator *
+get_default_collation_collator(void)
+{
+ /* Cache the result so we only have to compute it once. */
+ static UCollator *collator = NULL;
+
+ if (collator)
+ return collator;
+
+ collator = open_collator(get_icu_default_collate());
+ return collator;
+}
+#endif /* USE_ICU */
+
+/*
+ * Get the default collation provider.
+ */
+char
+get_default_collprovider(void)
+{
+ /* Cache the result so we only have to compute it once. */
+ static char result = '\0';
+ HeapTuple tp;
+ Form_pg_database dbform;
+ char *datcollate;
+
+ if (result)
+ return result;
+
+ tp = SearchSysCache1(DATABASEOID, ObjectIdGetDatum(MyDatabaseId));
+ if (!HeapTupleIsValid(tp))
+ elog(ERROR, "cache lookup failed for database %u", MyDatabaseId);
+
+ dbform = (Form_pg_database) GETSTRUCT(tp);
+ datcollate = NameStr(dbform->datcollate);
+ check_locale_collprovider(datcollate, NULL, &result, NULL);
+
+ if (!is_valid_nondefault_collprovider(result))
+ /* This could happen when manually creating a mess in the catalogs. */
+ ereport(FATAL,
+ (errmsg("could not find out the collation provider for datcollate \"%s\" of database \"%s\"",
+ datcollate, NameStr(dbform->datname))));
+
+ ReleaseSysCache(tp);
+ return result;
+}
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index 39c394331b..5864c991eb 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -1460,8 +1460,15 @@ varstr_cmp(const char *arg1, int len1, const char *arg2, int len2, Oid collid)
char *a1p,
*a2p;
pg_locale_t mylocale = 0;
+ char collprovider;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY,
+ use_icu;
- if (collid != DEFAULT_COLLATION_OID)
+ if (collid == DEFAULT_COLLATION_OID)
+ {
+ collprovider = get_default_collprovider();
+ }
+ else
{
if (!OidIsValid(collid))
{
@@ -1475,8 +1482,15 @@ varstr_cmp(const char *arg1, int len1, const char *arg2, int len2, Oid collid)
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
mylocale = pg_newlocale_from_collation(collid);
+ collprovider = mylocale->provider;
}
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
+
/*
* memcmp() can't tell us which of two unequal strings sorts first,
* but it's a cheap way to tell if they're equal. Testing shows that
@@ -1491,8 +1505,7 @@ varstr_cmp(const char *arg1, int len1, const char *arg2, int len2, Oid collid)
#ifdef WIN32
/* Win32 does not have UTF-8, so we need to map to UTF-16 */
- if (GetDatabaseEncoding() == PG_UTF8
- && (!mylocale || mylocale->provider == COLLPROVIDER_LIBC))
+ if (GetDatabaseEncoding() == PG_UTF8 && use_libc)
{
int a1len;
int a2len;
@@ -1594,60 +1607,67 @@ varstr_cmp(const char *arg1, int len1, const char *arg2, int len2, Oid collid)
memcpy(a2p, arg2, len2);
a2p[len2] = '\0';
- if (mylocale)
+ if (use_icu)
{
- if (mylocale->provider == COLLPROVIDER_ICU)
- {
#ifdef USE_ICU
+ UCollator *collator;
+
+ if (mylocale)
+ collator = mylocale->info.icu.ucol;
+ else
+ collator = get_default_collation_collator();
+
#ifdef HAVE_UCOL_STRCOLLUTF8
- if (GetDatabaseEncoding() == PG_UTF8)
- {
- UErrorCode status;
+ if (GetDatabaseEncoding() == PG_UTF8)
+ {
+ UErrorCode status;
- status = U_ZERO_ERROR;
- result = ucol_strcollUTF8(mylocale->info.icu.ucol,
- arg1, len1,
- arg2, len2,
- &status);
- if (U_FAILURE(status))
- ereport(ERROR,
- (errmsg("collation failed: %s", u_errorName(status))));
- }
- else
+ status = U_ZERO_ERROR;
+ result = ucol_strcollUTF8(collator,
+ arg1, len1,
+ arg2, len2,
+ &status);
+ if (U_FAILURE(status))
+ ereport(ERROR,
+ (errmsg("collation failed: %s", u_errorName(status))));
+ }
+ else
#endif
- {
- int32_t ulen1,
- ulen2;
- UChar *uchar1,
- *uchar2;
+ {
+ int32_t ulen1,
+ ulen2;
+ UChar *uchar1,
+ *uchar2;
- ulen1 = icu_to_uchar(&uchar1, arg1, len1);
- ulen2 = icu_to_uchar(&uchar2, arg2, len2);
+ ulen1 = icu_to_uchar(&uchar1, arg1, len1);
+ ulen2 = icu_to_uchar(&uchar2, arg2, len2);
- result = ucol_strcoll(mylocale->info.icu.ucol,
- uchar1, ulen1,
- uchar2, ulen2);
+ result = ucol_strcoll(collator,
+ uchar1, ulen1,
+ uchar2, ulen2);
- pfree(uchar1);
- pfree(uchar2);
- }
+ pfree(uchar1);
+ pfree(uchar2);
+ }
#else /* not USE_ICU */
- /* shouldn't happen */
- elog(ERROR, "unsupported collprovider: %c", mylocale->provider);
+ /* shouldn't happen */
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
#endif /* not USE_ICU */
- }
- else
- {
+ }
+ else
+ {
+ /* use_libc */
+
+ if (mylocale)
#ifdef HAVE_LOCALE_T
result = strcoll_l(a1p, a2p, mylocale->info.lt);
#else
/* shouldn't happen */
- elog(ERROR, "unsupported collprovider: %c", mylocale->provider);
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
#endif
- }
+ else
+ result = strcoll(a1p, a2p);
}
- else
- result = strcoll(a1p, a2p);
/*
* In some locales strcoll() can claim that nonidentical strings are
@@ -1897,6 +1917,9 @@ varstr_sortsupport(SortSupport ssup, Oid typid, Oid collid)
bool collate_c = false;
VarStringSortSupport *sss;
pg_locale_t locale = 0;
+ char collprovider = '\0';
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY = false;
+ bool use_icu = false;
/*
* If possible, set ssup->comparator to a function which can be used to
@@ -1933,7 +1956,11 @@ varstr_sortsupport(SortSupport ssup, Oid typid, Oid collid)
* we'll figure out the collation based on the locale id and cache the
* result.
*/
- if (collid != DEFAULT_COLLATION_OID)
+ if (collid == DEFAULT_COLLATION_OID)
+ {
+ collprovider = get_default_collprovider();
+ }
+ else
{
if (!OidIsValid(collid))
{
@@ -1947,8 +1974,15 @@ varstr_sortsupport(SortSupport ssup, Oid typid, Oid collid)
errhint("Use the COLLATE clause to set the collation explicitly.")));
}
locale = pg_newlocale_from_collation(collid);
+ collprovider = locale->provider;
}
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
+
/*
* There is a further exception on Windows. When the database
* encoding is UTF-8 and we are not using the C collation, complex
@@ -1958,8 +1992,7 @@ varstr_sortsupport(SortSupport ssup, Oid typid, Oid collid)
* trampoline. ICU locales work just the same on Windows, however.
*/
#ifdef WIN32
- if (GetDatabaseEncoding() == PG_UTF8 &&
- !(locale && locale->provider == COLLPROVIDER_ICU))
+ if (GetDatabaseEncoding() == PG_UTF8 && use_libc)
return;
#endif
@@ -1998,7 +2031,7 @@ varstr_sortsupport(SortSupport ssup, Oid typid, Oid collid)
* platforms.
*/
#ifndef TRUST_STRXFRM
- if (!collate_c && !(locale && locale->provider == COLLPROVIDER_ICU))
+ if (!collate_c && !use_icu)
abbreviate = false;
#endif
@@ -2194,6 +2227,9 @@ static int
varstrfastcmp_locale(char *a1p, int len1, char *a2p, int len2, SortSupport ssup)
{
VarStringSortSupport *sss = (VarStringSortSupport *) ssup->ssup_extra;
+ char collprovider;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY,
+ use_icu;
int result;
bool arg1_match;
@@ -2274,59 +2310,77 @@ varstrfastcmp_locale(char *a1p, int len1, char *a2p, int len2, SortSupport ssup)
}
if (sss->locale)
+ collprovider = sss->locale->provider;
+ else
+ collprovider = get_default_collprovider();
+
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
+
+ if (use_icu)
{
- if (sss->locale->provider == COLLPROVIDER_ICU)
- {
#ifdef USE_ICU
-#ifdef HAVE_UCOL_STRCOLLUTF8
- if (GetDatabaseEncoding() == PG_UTF8)
- {
- UErrorCode status;
+ UCollator *collator;
- status = U_ZERO_ERROR;
- result = ucol_strcollUTF8(sss->locale->info.icu.ucol,
- a1p, len1,
- a2p, len2,
- &status);
- if (U_FAILURE(status))
- ereport(ERROR,
- (errmsg("collation failed: %s", u_errorName(status))));
- }
- else
+ if (sss->locale)
+ collator = sss->locale->info.icu.ucol;
+ else
+ collator = get_default_collation_collator();
+
+#ifdef HAVE_UCOL_STRCOLLUTF8
+ if (GetDatabaseEncoding() == PG_UTF8)
+ {
+ UErrorCode status;
+
+ status = U_ZERO_ERROR;
+ result = ucol_strcollUTF8(collator,
+ a1p, len1,
+ a2p, len2,
+ &status);
+ if (U_FAILURE(status))
+ ereport(ERROR,
+ (errmsg("collation failed: %s", u_errorName(status))));
+ }
+ else
#endif
- {
- int32_t ulen1,
- ulen2;
- UChar *uchar1,
- *uchar2;
+ {
+ int32_t ulen1,
+ ulen2;
+ UChar *uchar1,
+ *uchar2;
- ulen1 = icu_to_uchar(&uchar1, a1p, len1);
- ulen2 = icu_to_uchar(&uchar2, a2p, len2);
+ ulen1 = icu_to_uchar(&uchar1, a1p, len1);
+ ulen2 = icu_to_uchar(&uchar2, a2p, len2);
- result = ucol_strcoll(sss->locale->info.icu.ucol,
- uchar1, ulen1,
- uchar2, ulen2);
+ result = ucol_strcoll(collator,
+ uchar1, ulen1,
+ uchar2, ulen2);
- pfree(uchar1);
- pfree(uchar2);
- }
+ pfree(uchar1);
+ pfree(uchar2);
+ }
#else /* not USE_ICU */
- /* shouldn't happen */
- elog(ERROR, "unsupported collprovider: %c", sss->locale->provider);
+ /* shouldn't happen */
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
#endif /* not USE_ICU */
- }
- else
- {
+ }
+ else
+ {
+ /* use_libc */
+
+ if (sss->locale)
#ifdef HAVE_LOCALE_T
result = strcoll_l(sss->buf1, sss->buf2, sss->locale->info.lt);
#else
/* shouldn't happen */
- elog(ERROR, "unsupported collprovider: %c", sss->locale->provider);
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
#endif
- }
+ else
+ result = strcoll(sss->buf1, sss->buf2);
}
- else
- result = strcoll(sss->buf1, sss->buf2);
/*
* In some locales strcoll() can claim that nonidentical strings are
@@ -2424,6 +2478,9 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
else
{
Size bsize;
+ char collprovider;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY,
+ use_icu;
#ifdef USE_ICU
int32_t ulen = -1;
UChar *uchar = NULL;
@@ -2460,10 +2517,20 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
sss->buf1[len] = '\0';
sss->last_len1 = len;
+ if (sss->locale)
+ collprovider = sss->locale->provider;
+ else
+ collprovider = get_default_collprovider();
+
+ use_icu = (collprovider == COLLPROVIDER_ICU &&
+ GetDatabaseEncoding() != PG_SQL_ASCII);
+ use_libc = (collprovider == COLLPROVIDER_LIBC ||
+ GetDatabaseEncoding() == PG_SQL_ASCII);
+ Assert(use_libc || use_icu);
+
#ifdef USE_ICU
/* When using ICU and not UTF8, convert string to UChar. */
- if (sss->locale && sss->locale->provider == COLLPROVIDER_ICU &&
- GetDatabaseEncoding() != PG_UTF8)
+ if (use_icu && GetDatabaseEncoding() != PG_UTF8)
ulen = icu_to_uchar(&uchar, sss->buf1, len);
#endif
@@ -2477,9 +2544,15 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
*/
for (;;)
{
-#ifdef USE_ICU
- if (sss->locale && sss->locale->provider == COLLPROVIDER_ICU)
+ if (use_icu)
{
+#ifdef USE_ICU
+ UCollator *collator;
+
+ if (sss->locale)
+ collator = sss->locale->info.icu.ucol;
+ else
+ collator = get_default_collation_collator();
/*
* When using UTF8, use the iteration interface so we only
* need to produce as many bytes as we actually need.
@@ -2493,7 +2566,7 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
uiter_setUTF8(&iter, sss->buf1, len);
state[0] = state[1] = 0; /* won't need that again */
status = U_ZERO_ERROR;
- bsize = ucol_nextSortKeyPart(sss->locale->info.icu.ucol,
+ bsize = ucol_nextSortKeyPart(collator,
&iter,
state,
(uint8_t *) sss->buf2,
@@ -2505,19 +2578,26 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
u_errorName(status))));
}
else
- bsize = ucol_getSortKey(sss->locale->info.icu.ucol,
+ bsize = ucol_getSortKey(collator,
uchar, ulen,
(uint8_t *) sss->buf2, sss->buflen2);
+#else /* not USE_ICU */
+ /* shouldn't happen */
+ elog(ERROR, "unsupported collprovider: %c", collprovider);
+#endif /* not USE_ICU */
}
else
-#endif
+ {
+ /* use_libc */
+
#ifdef HAVE_LOCALE_T
- if (sss->locale && sss->locale->provider == COLLPROVIDER_LIBC)
- bsize = strxfrm_l(sss->buf2, sss->buf1,
- sss->buflen2, sss->locale->info.lt);
- else
+ if (sss->locale)
+ bsize = strxfrm_l(sss->buf2, sss->buf1,
+ sss->buflen2, sss->locale->info.lt);
+ else
#endif
- bsize = strxfrm(sss->buf2, sss->buf1, sss->buflen2);
+ bsize = strxfrm(sss->buf2, sss->buf1, sss->buflen2);
+ }
sss->last_len2 = bsize;
if (bsize < sss->buflen2)
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 752010ed27..ea5582b784 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -31,9 +31,11 @@
#include "catalog/indexing.h"
#include "catalog/namespace.h"
#include "catalog/pg_authid.h"
+#include "catalog/pg_collation.h"
#include "catalog/pg_database.h"
#include "catalog/pg_db_role_setting.h"
#include "catalog/pg_tablespace.h"
+#include "common/pg_collation_fn_common.h"
#include "libpq/auth.h"
#include "libpq/libpq-be.h"
#include "mb/pg_wchar.h"
@@ -319,6 +321,13 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect
Form_pg_database dbform;
char *collate;
char *ctype;
+ char *datcollate;
+ char collprovider;
+ char *collversion;
+ char *wincollate = NULL;
+ char *langtag = NULL;
+ const char *collcollate;
+ char *actual_versionstr;
/* Fetch our pg_database row normally, via syscache */
tup = SearchSysCache1(DATABASEOID, ObjectIdGetDatum(MyDatabaseId));
@@ -400,27 +409,124 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect
PGC_BACKEND, PGC_S_DYNAMIC_DEFAULT);
/* assign locale variables */
- collate = NameStr(dbform->datcollate);
ctype = NameStr(dbform->datctype);
+ datcollate = NameStr(dbform->datcollate);
+ check_locale_collprovider(datcollate, &collate, &collprovider,
+ &collversion);
- if (pg_perm_setlocale(LC_COLLATE, collate) == NULL)
+ if (!is_valid_nondefault_collprovider(collprovider))
+ /* This could happen when manually creating a mess in the catalogs. */
+ ereport(FATAL,
+ (errmsg("could not find out the collation provider for datcollate \"%s\" of database \"%s\"",
+ datcollate, name)));
+
+#ifndef USE_ICU
+ if (collprovider == COLLPROVIDER_ICU)
+ ereport(FATAL,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("ICU is not supported in this build"),
+ errhint("Recreate the database with libc locale or rebuild PostgreSQL using --with-icu.")));
+#endif
+
+ /* we always check lc_collate for libc */
+ if (pg_perm_setlocale(LC_COLLATE, collate, COLLPROVIDER_LIBC) == NULL)
ereport(FATAL,
(errmsg("database locale is incompatible with operating system"),
- errdetail("The database was initialized with LC_COLLATE \"%s\", "
- " which is not recognized by setlocale().", collate),
+ errdetail("The database was initialized with LC_COLLATE \"%s\" (provider \"%s\"), "
+ " which is not recognized by setlocale().",
+ collate, get_collprovider_name(COLLPROVIDER_LIBC)),
errhint("Recreate the database with another locale or install the missing locale.")));
- if (pg_perm_setlocale(LC_CTYPE, ctype) == NULL)
+ /* check lc_collate and lc_ctype for icu if we need it */
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+ if (pg_perm_setlocale(LC_COLLATE, collate, collprovider) == NULL)
+ ereport(FATAL,
+ (errmsg("database locale is incompatible with operating system"),
+ errdetail("The database was initialized with LC_COLLATE \"%s\" (provider \"%s\"), "
+ " which is not recognized by uloc_setDefault().",
+ collate, get_collprovider_name(collprovider)),
+ errhint("Recreate the database with another locale or install the missing locale.")));
+
+ /* This could happen when manually creating a mess in the catalogs. */
+ if (strcmp(collate, ctype) != 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("collations with different collate and ctype values are not supported by ICU")));
+ }
+
+ if (pg_perm_setlocale(LC_CTYPE, ctype, '\0') == NULL)
ereport(FATAL,
(errmsg("database locale is incompatible with operating system"),
errdetail("The database was initialized with LC_CTYPE \"%s\", "
" which is not recognized by setlocale().", ctype),
errhint("Recreate the database with another locale or install the missing locale.")));
+ /* get the actual version of the collation */
+
+#ifdef USE_ICU
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+ collcollate = (const char *) collate;
+#ifdef WIN32
+ if (!locale_is_c(collcollate))
+ {
+ wincollate = check_icu_winlocale(collcollate);
+ collcollate = (const char *) wincollate;
+ }
+#endif /* WIN32 */
+ langtag = get_icu_language_tag(collcollate);
+ collcollate = get_icu_collate(collcollate, langtag);
+ }
+ else
+#endif /* USE_ICU */
+ {
+ /* COLLPROVIDER_LIBC */
+ collcollate = (const char *) collate;
+ }
+
+ actual_versionstr = get_collation_actual_version(collprovider, collcollate);
+
+ /*
+ * Check the collation version (this matches the version checking in the
+ * function pg_newlocale_from_collation())
+ */
+ if (collversion)
+ {
+ if (!actual_versionstr)
+ {
+ /*
+ * This could happen when manually creating a mess in the catalogs.
+ */
+ ereport(ERROR,
+ (errmsg("collation \"%s\" (provider \"%s\") has no actual version, but a version was specified",
+ collate, get_collprovider_name(collprovider))));
+ }
+
+ if (strcmp(actual_versionstr, collversion) != 0)
+ ereport(ERROR,
+ (errmsg("collation \"%s\" (provider \"%s\") has version mismatch",
+ collate, get_collprovider_name(collprovider)),
+ errdetail("The collation in the database was created using version %s, "
+ "but the operating system provides version %s.",
+ collversion, actual_versionstr),
+ errhint("Build PostgreSQL with the right library version.")));
+ }
+
/* Make the locale settings visible as GUC variables, too */
- SetConfigOption("lc_collate", collate, PGC_INTERNAL, PGC_S_OVERRIDE);
+ SetConfigOption("lc_collate", datcollate, PGC_INTERNAL, PGC_S_OVERRIDE);
SetConfigOption("lc_ctype", ctype, PGC_INTERNAL, PGC_S_OVERRIDE);
+ pfree(collate);
+ if (collversion)
+ pfree(collversion);
+ if (langtag)
+ pfree(langtag);
+ if (actual_versionstr)
+ pfree(actual_versionstr);
+ if (wincollate)
+ pfree(wincollate);
+
check_strxfrm_bug();
ReleaseSysCache(tup);
diff --git a/src/backend/utils/mb/encnames.c b/src/backend/utils/mb/encnames.c
index 12b61cd3db..1e75257651 100644
--- a/src/backend/utils/mb/encnames.c
+++ b/src/backend/utils/mb/encnames.c
@@ -403,8 +403,6 @@ const pg_enc2gettext pg_enc2gettext_tbl[] =
};
-#ifndef FRONTEND
-
/*
* Table of encoding names for ICU
*
@@ -457,6 +455,7 @@ is_encoding_supported_by_icu(int encoding)
return (pg_enc2icu_tbl[encoding] != NULL);
}
+#ifndef FRONTEND
const char *
get_encoding_name_for_icu(int encoding)
{
@@ -475,7 +474,6 @@ get_encoding_name_for_icu(int encoding)
return icu_encoding_name;
}
-
#endif /* not FRONTEND */
diff --git a/src/bin/initdb/Makefile b/src/bin/initdb/Makefile
index 7c404430a9..d99cc2815d 100644
--- a/src/bin/initdb/Makefile
+++ b/src/bin/initdb/Makefile
@@ -19,7 +19,7 @@ include $(top_builddir)/src/Makefile.global
override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) -I$(top_srcdir)/src/timezone $(CPPFLAGS)
# note: we need libpq only because fe_utils does
-LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport) $(ICU_LIBS)
# use system timezone data?
ifneq (,$(with_system_tzdata))
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index fd50a809ea..c96fd8d516 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -55,6 +55,10 @@
#include <signal.h>
#include <time.h>
+#ifdef USE_ICU
+#include <unicode/uloc.h>
+#endif
+
#ifdef HAVE_SHM_OPEN
#include "sys/mman.h"
#endif
@@ -65,6 +69,7 @@
#include "catalog/pg_collation_d.h"
#include "common/file_perm.h"
#include "common/file_utils.h"
+#include "common/pg_collation_fn_common.h"
#include "common/restricted_token.h"
#include "common/username.h"
#include "fe_utils/string_utils.h"
@@ -144,6 +149,8 @@ static bool data_checksums = false;
static char *xlog_dir = NULL;
static char *str_wal_segment_size_mb = NULL;
static int wal_segment_size_mb;
+static char collprovider = '\0';
+static char *collversion = NULL;
/* internal vars */
@@ -267,10 +274,15 @@ static char *escape_quotes(const char *src);
static char *escape_quotes_bki(const char *src);
static int locale_date_order(const char *locale);
static void check_locale_name(int category, const char *locale,
- char **canonname);
-static bool check_locale_encoding(const char *locale, int encoding);
+ char **canonname, char collprovider);
+static bool check_locale_encoding(const char *locale, int encoding,
+ char collprovider);
static void setlocales(void);
static void usage(const char *progname);
+#ifdef USE_ICU
+static char *check_icu_locale_name(const char *locale);
+#endif
+static void set_collation_version(void);
void setup_pgdata(void);
void setup_bin_paths(const char *argv0);
void setup_data_file_paths(void);
@@ -1406,10 +1418,27 @@ bootstrap_template1(void)
char **bki_lines;
char headerline[MAXPGPATH];
char buf[64];
+ char *lc_collate_full_name;
printf(_("running bootstrap script ... "));
fflush(stdout);
+ Assert(lc_collate);
+
+ lc_collate_full_name = get_full_collation_name(lc_collate, collprovider,
+ collversion);
+
+ if (!lc_collate_full_name)
+ exit(1); /* get_full_collation_name printed the error */
+
+ if (strlen(lc_collate_full_name) >= NAMEDATALEN)
+ {
+ fprintf(stderr,
+ _("%s: the full collation name \"%s\" is too long\n"),
+ progname, lc_collate_full_name);
+ exit(1);
+ }
+
bki_lines = readfile(bki_file);
/* Check that bki file appears to be of the right version */
@@ -1451,7 +1480,7 @@ bootstrap_template1(void)
encodingid_to_string(encodingid));
bki_lines = replace_token(bki_lines, "LC_COLLATE",
- escape_quotes_bki(lc_collate));
+ escape_quotes_bki(lc_collate_full_name));
bki_lines = replace_token(bki_lines, "LC_CTYPE",
escape_quotes_bki(lc_ctype));
@@ -1493,6 +1522,7 @@ bootstrap_template1(void)
PG_CMD_CLOSE;
free(bki_lines);
+ free(lc_collate_full_name);
check_ok();
}
@@ -2224,53 +2254,143 @@ locale_date_order(const char *locale)
* the locale name, but typically it doesn't.)
*
* this should match the backend's check_locale() function
+ *
+ * Callers must pass collprovider as '\0' if category is not LC_COLLATE.
*/
static void
-check_locale_name(int category, const char *locale, char **canonname)
+check_locale_name(int category, const char *locale, char **canonname,
+ char collprovider)
{
- char *save;
- char *res;
+ const char *save;
+ const char *res;
+ char *save_dup;
+ bool use_libc PG_USED_FOR_ASSERTS_ONLY =
+ category != LC_COLLATE || collprovider == COLLPROVIDER_LIBC;
+ bool use_icu =
+ category == LC_COLLATE && collprovider == COLLPROVIDER_ICU;
+ bool failure = false;
+#ifdef USE_ICU
+ UErrorCode status;
+ char *icu_locale;
+#endif
- if (canonname)
- *canonname = NULL; /* in case of failure */
+ Assert(use_libc || use_icu);
- save = setlocale(category, NULL);
- if (!save)
+#ifndef USE_ICU
+ if (use_icu)
{
- fprintf(stderr, _("%s: setlocale() failed\n"),
+ fprintf(stderr,
+ _("%s: ICU is not supported in this build\n"
+ "You need to rebuild PostgreSQL using --with-icu.\n"),
progname);
exit(1);
}
+#endif
+
+ if (canonname)
+ *canonname = NULL; /* in case of failure */
+
+#ifdef USE_ICU
+ if (use_icu)
+ {
+ save = uloc_getDefault();
+ if (!save)
+ {
+ fprintf(stderr, _("%s: ICU error: uloc_getDefault() failed\n"),
+ progname);
+ exit(1);
+ }
+ }
+ else
+#endif
+ {
+ /* use_libc */
+ save = setlocale(category, NULL);
+ if (!save)
+ {
+ fprintf(stderr, _("%s: setlocale() failed\n"),
+ progname);
+ exit(1);
+ }
+ }
/* save may be pointing at a modifiable scratch variable, so copy it. */
- save = pg_strdup(save);
+ save_dup = pg_strdup(save);
/* for setlocale() call */
if (!locale)
locale = "";
/* set the locale with setlocale, to see if it accepts it. */
- res = setlocale(category, locale);
+#ifdef USE_ICU
+ if (use_icu)
+ {
+ icu_locale = check_icu_locale_name(locale);
+ if (icu_locale == NULL && locale != NULL)
+ {
+ failure = true;
+ res = NULL;
+ }
+ else
+ {
+ status = U_ZERO_ERROR;
+ uloc_setDefault(icu_locale, &status);
+ res = uloc_getDefault();
+ failure = (U_FAILURE(status) || res == NULL);
+ if (icu_locale)
+ pfree(icu_locale);
+ }
+ }
+ else
+#endif
+ {
+ /* use_libc */
+ res = setlocale(category, locale);
+ failure = (res == NULL);
+ }
/* save canonical name if requested. */
if (res && canonname)
*canonname = pg_strdup(res);
/* restore old value. */
- if (!setlocale(category, save))
+#ifdef USE_ICU
+ if (use_icu)
{
- fprintf(stderr, _("%s: failed to restore old locale \"%s\"\n"),
- progname, save);
- exit(1);
+ status = U_ZERO_ERROR;
+ uloc_setDefault(save_dup, &status);
+ if (U_FAILURE(status))
+ {
+ fprintf(stderr, _("%s: ICU error: failed to restore old locale \"%s\"\n"),
+ progname, save_dup);
+ exit(1);
+ }
}
- free(save);
+ else
+#endif
+ {
+ /* use_libc */
+ if (!setlocale(category, save_dup))
+ {
+ fprintf(stderr, _("%s: failed to restore old locale \"%s\"\n"),
+ progname, save_dup);
+ exit(1);
+ }
+ }
+ free(save_dup);
/* complain if locale wasn't valid */
- if (res == NULL)
+ if (failure)
{
if (*locale)
- fprintf(stderr, _("%s: invalid locale name \"%s\"\n"),
- progname, locale);
+ {
+ if (category == LC_COLLATE)
+ fprintf(stderr, _("%s: invalid locale name \"%s\" (provider \"%s\")\n"),
+ progname, locale, get_collprovider_name(collprovider));
+ else
+ fprintf(stderr, _("%s: invalid locale name \"%s\"\n"),
+ progname, locale);
+ }
else
{
/*
@@ -2292,9 +2412,11 @@ check_locale_name(int category, const char *locale, char **canonname)
* check if the chosen encoding matches the encoding required by the locale
*
* this should match the similar check in the backend createdb() function
+ *
+ * Callers must pass collprovider as '\0' unless the locale is being checked
+ * for LC_COLLATE.
*/
static bool
-check_locale_encoding(const char *locale, int user_enc)
+check_locale_encoding(const char *locale, int user_enc, char collprovider)
{
int locale_enc;
@@ -2321,6 +2443,25 @@ check_locale_encoding(const char *locale, int user_enc)
progname);
return false;
}
+
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+#ifdef USE_ICU
+ if (!is_encoding_supported_by_icu(user_enc))
+ {
+ fprintf(stderr, _("%s: selected encoding (%s) is not supported for ICU locales\n"),
+ progname, pg_encoding_to_char(user_enc));
+ return false;
+ }
+#else /* not USE_ICU */
+ fprintf(stderr,
+ _("%s: ICU is not supported in this build\n"
+ "You need to rebuild PostgreSQL using --with-icu.\n"),
+ progname);
+ exit(1);
+#endif /* not USE_ICU */
+ }
+
return true;
}
@@ -2332,16 +2473,22 @@ check_locale_encoding(const char *locale, int user_enc)
static void
setlocales(void)
{
- char *canonname;
-
- /* set empty lc_* values to locale config if set */
+ char *canonname = NULL;
if (locale)
{
+ /*
+ * Set up the collation provider if possible and canonicalize the locale
+ * name.
+ */
+ check_locale_collprovider(locale, &canonname, &collprovider, NULL);
+ if (!canonname)
+ exit(1); /* check_locale_collprovider printed the error */
+ locale = canonname;
+
+ /* set empty lc_* values to locale config if set */
if (!lc_ctype)
lc_ctype = locale;
- if (!lc_collate)
- lc_collate = locale;
if (!lc_numeric)
lc_numeric = locale;
if (!lc_time)
@@ -2352,29 +2499,83 @@ setlocales(void)
lc_messages = locale;
}
+ if (lc_collate)
+ {
+ /*
+ * Set up the collation provider if possible and canonicalize the locale
+ * name.
+ */
+ check_locale_collprovider(lc_collate, &canonname, &collprovider, NULL);
+ if (!canonname)
+ exit(1); /* check_locale_collprovider printed the error */
+ lc_collate = canonname;
+ }
+ else if (canonname)
+ {
+ /* we have already canonicalized the locale name */
+ lc_collate = pstrdup(canonname);
+ }
+
/*
* canonicalize locale names, and obtain any missing values from our
* current environment
*/
- check_locale_name(LC_CTYPE, lc_ctype, &canonname);
+ check_locale_name(LC_CTYPE, lc_ctype, &canonname, '\0');
lc_ctype = canonname;
- check_locale_name(LC_COLLATE, lc_collate, &canonname);
+
+ /* we always check lc_collate for libc */
+ check_locale_name(LC_COLLATE, lc_collate, &canonname, COLLPROVIDER_LIBC);
+ if (lc_collate)
+ pfree(lc_collate);
lc_collate = canonname;
- check_locale_name(LC_NUMERIC, lc_numeric, &canonname);
+
+ /* determine the collation provider if we haven't already done it */
+ if (!is_valid_nondefault_collprovider(collprovider))
+ {
+#ifdef USE_ICU
+ if (!locale_is_c(lc_collate))
+ {
+ collprovider = COLLPROVIDER_ICU;
+ }
+ else
+#endif
+ {
+ collprovider = COLLPROVIDER_LIBC;
+ }
+ }
+
+ Assert(is_valid_nondefault_collprovider(collprovider));
+
+ /* check lc_collate and lc_ctype for icu if we need it */
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+ check_locale_name(LC_COLLATE, lc_collate, NULL, collprovider);
+ if (strcmp(lc_collate, lc_ctype) != 0)
+ {
+ fprintf(stderr,
+ _("%s: collations with different collate and ctype values are not supported by ICU\n"),
+ progname);
+ exit(1);
+ }
+ }
+
+ check_locale_name(LC_NUMERIC, lc_numeric, &canonname, '\0');
lc_numeric = canonname;
- check_locale_name(LC_TIME, lc_time, &canonname);
+ check_locale_name(LC_TIME, lc_time, &canonname, '\0');
lc_time = canonname;
- check_locale_name(LC_MONETARY, lc_monetary, &canonname);
+ check_locale_name(LC_MONETARY, lc_monetary, &canonname, '\0');
lc_monetary = canonname;
#if defined(LC_MESSAGES) && !defined(WIN32)
- check_locale_name(LC_MESSAGES, lc_messages, &canonname);
+ check_locale_name(LC_MESSAGES, lc_messages, &canonname, '\0');
lc_messages = canonname;
#else
/* when LC_MESSAGES is not available, use the LC_CTYPE setting */
- check_locale_name(LC_CTYPE, lc_messages, &canonname);
+ check_locale_name(LC_CTYPE, lc_messages, &canonname, '\0');
lc_messages = canonname;
#endif
+
+ set_collation_version();
}
/*
@@ -2592,6 +2793,9 @@ setup_locale_encoding(void)
lc_time);
}
+ printf(_("The default collation provider is \"%s\".\n"),
+ get_collprovider_name(collprovider));
+
if (!encoding)
{
int ctype_enc;
@@ -2642,8 +2846,8 @@ setup_locale_encoding(void)
else
encodingid = get_encoding_id(encoding);
- if (!check_locale_encoding(lc_ctype, encodingid) ||
- !check_locale_encoding(lc_collate, encodingid))
+ if (!check_locale_encoding(lc_ctype, encodingid, '\0') ||
+ !check_locale_encoding(lc_collate, encodingid, collprovider))
exit(1); /* check_locale_encoding printed the error */
}
@@ -3418,3 +3622,113 @@ main(int argc, char *argv[])
success = true;
return 0;
}
+
+#ifdef USE_ICU
+/*
+ * If locale is "" or a Windows-style ".codepage" name, return the canonical
+ * name that setlocale() reports for it.
+ *
+ * Otherwise return a malloc'd copy of locale if it is not NULL.
+ *
+ * This should match the backend's check_icu_locale() function.
+ */
+static char *
+check_icu_locale_name(const char *locale)
+{
+ char *canonname = NULL;
+ char *winlocale = NULL;
+ char *result;
+
+ /* Windows locales can be in the format ".codepage" */
+ if (locale && (strlen(locale) == 0 || locale[0] == '.'))
+ {
+ check_locale_name(LC_COLLATE, locale, &canonname, COLLPROVIDER_LIBC);
+ locale = (const char *) canonname;
+ }
+
+#ifdef WIN32
+ if (!locale_is_c(locale))
+ {
+ winlocale = check_icu_winlocale(locale);
+
+ if (winlocale == NULL && locale != NULL)
+ exit(1); /* check_icu_winlocale printed the error */
+ else
+ locale = winlocale;
+ }
+#endif
+
+ result = locale ? pstrdup(locale) : NULL;
+
+ if (canonname)
+ pfree(canonname);
+ if (winlocale)
+ pfree(winlocale);
+
+ return result;
+}
+#endif /* USE_ICU */
+
+/*
+ * Set up the lc_collate version (obtained from the collation provider).
+ */
+static void
+set_collation_version(void)
+{
+ char *wincollate = NULL;
+ char *langtag = NULL;
+ const char *collate;
+ bool failure;
+
+ Assert(lc_collate);
+ Assert(is_valid_nondefault_collprovider(collprovider));
+
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+#ifdef USE_ICU
+ collate = (const char *) lc_collate;
+
+#ifdef WIN32
+ if (!locale_is_c(collate))
+ {
+ wincollate = check_icu_winlocale(collate);
+
+ if (wincollate == NULL && collate != NULL)
+ exit(1); /* check_icu_winlocale printed the error */
+ else
+ collate = (const char *) wincollate;
+ }
+#endif /* WIN32 */
+
+ langtag = get_icu_language_tag(collate);
+ if (!langtag)
+ {
+ /* get_icu_language_tag printed the main error message */
+ fprintf(stderr, _("Rerun %s with a different locale selection.\n"),
+ progname);
+ exit(1);
+ }
+ collate = get_icu_collate(collate, langtag);
+#else /* not USE_ICU */
+ fprintf(stderr,
+ _("%s: ICU is not supported in this build\n"
+ "You need to rebuild PostgreSQL using --with-icu.\n"),
+ progname);
+ exit(1);
+#endif /* not USE_ICU */
+ }
+ else
+ {
+ /* COLLPROVIDER_LIBC */
+ collate = (const char *) lc_collate;
+ }
+
+ get_collation_actual_version(collprovider, collate, &collversion, &failure);
+ if (failure)
+ /* get_collation_actual_version printed the error */
+ exit(1);
+
+ if (langtag)
+ free(langtag);
+ if (wincollate)
+ free(wincollate);
+}
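For anyone reviewing the libc branch of check_locale_name() above: the probe is a save / set / restore sequence around setlocale(). Below is a minimal standalone sketch of that sequence, not taken from the patch. The name probe_libc_locale is hypothetical, and error reporting is reduced to a boolean instead of the fprintf()/exit() calls initdb uses.

```c
#include <assert.h>
#include <locale.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): return true if setlocale()
 * accepts "name" for the given category.  The process locale is restored
 * to its previous value before returning.
 */
static bool
probe_libc_locale(int category, const char *name)
{
	const char *cur;
	char	   *saved;
	size_t		len;
	bool		ok;

	assert(name != NULL);

	/* Query the current setting; the result may point at static storage. */
	cur = setlocale(category, NULL);
	if (cur == NULL)
		return false;

	/* Copy it, because the next setlocale() call may overwrite it. */
	len = strlen(cur) + 1;
	saved = malloc(len);
	if (saved == NULL)
		return false;
	memcpy(saved, cur, len);

	/* Try the candidate; NULL means libc rejected the name. */
	ok = (setlocale(category, name) != NULL);

	/* Restore the old setting whether or not the probe succeeded. */
	if (setlocale(category, saved) == NULL)
		ok = false;

	free(saved);
	return ok;
}
```

The ICU branch in the patch follows the same shape with uloc_getDefault() and uloc_setDefault() in place of setlocale().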
diff --git a/src/bin/pg_basebackup/Makefile b/src/bin/pg_basebackup/Makefile
index d7a081b9bb..132d0b9c9a 100644
--- a/src/bin/pg_basebackup/Makefile
+++ b/src/bin/pg_basebackup/Makefile
@@ -19,7 +19,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
-LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport) $(ICU_LIBS)
OBJS=receivelog.o streamutil.o walmethods.o $(WIN32RES)
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index d3c1dce178..beb2146331 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -17,7 +17,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
-LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport) $(ICU_LIBS)
OBJS= pg_backup_archiver.o pg_backup_db.o pg_backup_custom.o \
pg_backup_null.o pg_backup_tar.o pg_backup_directory.o \
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 4c98ae4d7f..d8c479b61e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -48,12 +48,14 @@
#include "catalog/pg_attribute_d.h"
#include "catalog/pg_cast_d.h"
#include "catalog/pg_class_d.h"
+#include "catalog/pg_collation.h"
#include "catalog/pg_default_acl_d.h"
#include "catalog/pg_largeobject_d.h"
#include "catalog/pg_largeobject_metadata_d.h"
#include "catalog/pg_proc_d.h"
#include "catalog/pg_trigger_d.h"
#include "catalog/pg_type_d.h"
+#include "common/pg_collation_fn_common.h"
#include "libpq/libpq-fs.h"
#include "storage/block.h"
@@ -13419,9 +13421,10 @@ dumpCollation(Archive *fout, CollInfo *collinfo)
int i_collprovider;
int i_collcollate;
int i_collctype;
- const char *collprovider;
+ const char *collproviderstr;
const char *collcollate;
const char *collctype;
+ const char *collprovider_name;
/* Skip if not to be dumped */
if (!collinfo->dobj.dump || dopt->dataOnly)
@@ -13459,28 +13462,28 @@ dumpCollation(Archive *fout, CollInfo *collinfo)
i_collcollate = PQfnumber(res, "collcollate");
i_collctype = PQfnumber(res, "collctype");
- collprovider = PQgetvalue(res, 0, i_collprovider);
+ collproviderstr = PQgetvalue(res, 0, i_collprovider);
collcollate = PQgetvalue(res, 0, i_collcollate);
collctype = PQgetvalue(res, 0, i_collctype);
+ /*
+ * Use COLLPROVIDER_DEFAULT to allow dumping pg_catalog; not accepted on
+ * input
+ */
+ collprovider_name = get_collprovider_name(collproviderstr[0]);
+ if (!collprovider_name)
+ exit_horribly(NULL,
+ "unrecognized collation provider: %s\n",
+ collproviderstr);
+
+
appendPQExpBuffer(delq, "DROP COLLATION %s;\n",
fmtQualifiedDumpable(collinfo));
appendPQExpBuffer(q, "CREATE COLLATION %s (",
fmtQualifiedDumpable(collinfo));
- appendPQExpBufferStr(q, "provider = ");
- if (collprovider[0] == 'c')
- appendPQExpBufferStr(q, "libc");
- else if (collprovider[0] == 'i')
- appendPQExpBufferStr(q, "icu");
- else if (collprovider[0] == 'd')
- /* to allow dumping pg_catalog; not accepted on input */
- appendPQExpBufferStr(q, "default");
- else
- exit_horribly(NULL,
- "unrecognized collation provider: %s\n",
- collprovider);
+ appendPQExpBuffer(q, "provider = %s", collprovider_name);
if (strcmp(collcollate, collctype) == 0)
{
diff --git a/src/bin/pg_rewind/Makefile b/src/bin/pg_rewind/Makefile
index 04f3b8f520..f42e8f018f 100644
--- a/src/bin/pg_rewind/Makefile
+++ b/src/bin/pg_rewind/Makefile
@@ -16,7 +16,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
override CPPFLAGS := -I$(libpq_srcdir) -DFRONTEND $(CPPFLAGS)
-LDFLAGS_INTERNAL += $(libpq_pgport)
+LDFLAGS_INTERNAL += $(libpq_pgport) $(ICU_LIBS)
OBJS = pg_rewind.o parsexlog.o xlogreader.o datapagemap.o timeline.o \
fetch.o file_ops.o copy_fetch.o libpq_fetch.o filemap.o logging.o \
diff --git a/src/bin/pg_upgrade/Makefile b/src/bin/pg_upgrade/Makefile
index adb0d5d707..9c4d62098e 100644
--- a/src/bin/pg_upgrade/Makefile
+++ b/src/bin/pg_upgrade/Makefile
@@ -12,7 +12,7 @@ OBJS = check.o controldata.o dump.o exec.o file.o function.o info.o \
tablespace.o util.o version.o $(WIN32RES)
override CPPFLAGS := -DDLSUFFIX=\"$(DLSUFFIX)\" -I$(srcdir) -I$(libpq_srcdir) $(CPPFLAGS)
-LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport) $(ICU_LIBS)
all: pg_upgrade
diff --git a/src/bin/pgbench/Makefile b/src/bin/pgbench/Makefile
index 25abd0a875..e8609ba567 100644
--- a/src/bin/pgbench/Makefile
+++ b/src/bin/pgbench/Makefile
@@ -10,7 +10,7 @@ include $(top_builddir)/src/Makefile.global
OBJS = pgbench.o exprparse.o $(WIN32RES)
override CPPFLAGS := -I. -I$(srcdir) -I$(libpq_srcdir) $(CPPFLAGS)
-LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport) $(ICU_LIBS)
ifneq ($(PORTNAME), win32)
override CFLAGS += $(PTHREAD_CFLAGS)
diff --git a/src/bin/psql/Makefile b/src/bin/psql/Makefile
index 69bb297fe7..092a68350d 100644
--- a/src/bin/psql/Makefile
+++ b/src/bin/psql/Makefile
@@ -19,7 +19,7 @@ include $(top_builddir)/src/Makefile.global
REFDOCDIR= $(top_srcdir)/doc/src/sgml/ref
override CPPFLAGS := -I. -I$(srcdir) -I$(libpq_srcdir) $(CPPFLAGS)
-LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport) $(ICU_LIBS)
OBJS= command.o common.o copy.o crosstabview.o \
describe.o help.o input.o large_obj.o mainloop.o \
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 779e48437c..febd9a55bb 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -17,7 +17,9 @@
#include "catalog/pg_attribute_d.h"
#include "catalog/pg_cast_d.h"
#include "catalog/pg_class_d.h"
+#include "catalog/pg_collation.h"
#include "catalog/pg_default_acl_d.h"
+#include "common/pg_collation_fn_common.h"
#include "fe_utils/string_utils.h"
#include "common.h"
@@ -4132,7 +4134,13 @@ listCollations(const char *pattern, bool verbose, bool showSystem)
if (pset.sversion >= 100000)
appendPQExpBuffer(&buf,
- ",\n CASE c.collprovider WHEN 'd' THEN 'default' WHEN 'c' THEN 'libc' WHEN 'i' THEN 'icu' END AS \"%s\"",
+ ",\n CASE c.collprovider WHEN '%c' THEN '%s' WHEN '%c' THEN '%s' WHEN '%c' THEN '%s' END AS \"%s\"",
+ COLLPROVIDER_DEFAULT,
+ get_collprovider_name(COLLPROVIDER_DEFAULT),
+ COLLPROVIDER_LIBC,
+ get_collprovider_name(COLLPROVIDER_LIBC),
+ COLLPROVIDER_ICU,
+ get_collprovider_name(COLLPROVIDER_ICU),
gettext_noop("Provider"));
if (verbose)
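A quick way to convince oneself that the describe.c change above is behavior-preserving is to expand the new format string and compare it with the literal it replaces. The sketch below does that outside psql. The three COLLPROVIDER_* values are assumptions copied from the 'd' / 'c' / 'i' literals in the removed line, provider_name() is a simplified stand-in for get_collprovider_name(), and the trailing AS "%s" column alias is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed values, mirroring the literals in the line the patch removes. */
#define COLLPROVIDER_DEFAULT	'd'
#define COLLPROVIDER_ICU		'i'
#define COLLPROVIDER_LIBC		'c'

/* Simplified stand-in for get_collprovider_name() from the patch. */
static const char *
provider_name(char collprovider)
{
	switch (collprovider)
	{
		case COLLPROVIDER_DEFAULT:
			return "default";
		case COLLPROVIDER_LIBC:
			return "libc";
		case COLLPROVIDER_ICU:
			return "icu";
	}
	return NULL;
}

/* Expand the CASE expression with the same argument order as the patch. */
static int
build_provider_case(char *buf, size_t size)
{
	assert(buf != NULL && size > 0);

	return snprintf(buf, size,
					"CASE c.collprovider WHEN '%c' THEN '%s' "
					"WHEN '%c' THEN '%s' WHEN '%c' THEN '%s' END",
					COLLPROVIDER_DEFAULT,
					provider_name(COLLPROVIDER_DEFAULT),
					COLLPROVIDER_LIBC,
					provider_name(COLLPROVIDER_LIBC),
					COLLPROVIDER_ICU,
					provider_name(COLLPROVIDER_ICU));
}
```

With those assumed values the expansion matches the old hard-coded CASE text, so the query sent to the server should be unchanged.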
diff --git a/src/bin/scripts/Makefile b/src/bin/scripts/Makefile
index 9f352b5e2b..1d75de81ca 100644
--- a/src/bin/scripts/Makefile
+++ b/src/bin/scripts/Makefile
@@ -19,7 +19,7 @@ include $(top_builddir)/src/Makefile.global
PROGRAMS = createdb createuser dropdb dropuser clusterdb vacuumdb reindexdb pg_isready
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
-LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport) $(ICU_LIBS)
all: $(PROGRAMS)
diff --git a/src/bin/scripts/createdb.c b/src/bin/scripts/createdb.c
index b40eea4365..4896ad6600 100644
--- a/src/bin/scripts/createdb.c
+++ b/src/bin/scripts/createdb.c
@@ -58,6 +58,7 @@ main(int argc, char *argv[])
char *lc_collate = NULL;
char *lc_ctype = NULL;
char *locale = NULL;
+ char *canonname = NULL;
PQExpBufferData sql;
@@ -153,7 +154,15 @@ main(int argc, char *argv[])
progname);
exit(1);
}
- lc_ctype = locale;
+
+ /*
+ * remove the collation provider modifier from the locale for lc_ctype
+ */
+ check_locale_collprovider(locale, &canonname, NULL, NULL);
+ if (!canonname)
+ exit(1); /* check_locale_collprovider printed the error */
+ lc_ctype = canonname;
+
lc_collate = locale;
}
@@ -241,6 +250,9 @@ main(int argc, char *argv[])
PQfinish(conn);
+ if (canonname)
+ pfree(canonname);
+
exit(0);
}
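The createdb change above relies on check_locale_collprovider() to strip the provider modifier from --locale before using it for lc_ctype, while lc_collate keeps the full string. As a rough illustration of the "locale@provider.version" splitting that function's comment describes, here is a standalone sketch. The helper name split_locale_modifier and its buffer-based interface are hypothetical; the sketch does not validate the provider name, and how the real function treats modifiers it does not recognize (a libc-style "@euro", say) is not covered here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): split
 * "locale@provider.version" into its parts.  Each output buffer holds
 * "size" bytes; a part that is absent comes back as an empty string.
 * Returns 0 on success, -1 if a part does not fit.
 */
static int
split_locale_modifier(const char *locale,
					  char *base, char *provider, char *version,
					  size_t size)
{
	const char *at;
	const char *dot;
	const char *provider_end;
	size_t		len;

	assert(locale != NULL && size > 0);
	base[0] = provider[0] = version[0] = '\0';

	/* The modifier starts at the last '@', as in the patch. */
	at = strrchr(locale, '@');
	if (at == NULL)
	{
		/* No modifier: the whole string is the locale name. */
		len = strlen(locale);
		if (len >= size)
			return -1;
		memcpy(base, locale, len + 1);
		return 0;
	}

	/* Everything before '@' is the locale name. */
	len = (size_t) (at - locale);
	if (len >= size)
		return -1;
	memcpy(base, locale, len);
	base[len] = '\0';

	/* The provider name runs up to the first '.' after '@', if any. */
	dot = strchr(at, '.');
	provider_end = dot ? dot : locale + strlen(locale);
	len = (size_t) (provider_end - at - 1);
	if (len >= size)
		return -1;
	memcpy(provider, at + 1, len);
	provider[len] = '\0';

	/* Whatever follows that '.' is treated as the version string. */
	if (dot)
	{
		len = strlen(dot + 1);
		if (len >= size)
			return -1;
		memcpy(version, dot + 1, len + 1);
	}
	return 0;
}
```

Because the search for '.' starts at the '@', an encoding suffix such as ".UTF-8" that precedes the modifier stays part of the locale name.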
diff --git a/src/common/Makefile b/src/common/Makefile
index d84c7b6e6a..6b6c4f0a08 100644
--- a/src/common/Makefile
+++ b/src/common/Makefile
@@ -50,7 +50,7 @@ OBJS_COMMON = base64.o config_info.o controldata_utils.o d2s.o exec.o f2s.o \
file_perm.o ip.o keywords.o kwlookup.o link-canary.o md5.o \
pg_lzcompress.o pgfnames.o psprintf.o relpath.o \
rmtree.o saslprep.o scram-common.o string.o unicode_norm.o \
- username.o wait_error.o
+ username.o wait_error.o pg_collation_fn_common.o
ifeq ($(with_openssl),yes)
OBJS_COMMON += sha2_openssl.o
diff --git a/src/common/pg_collation_fn_common.c b/src/common/pg_collation_fn_common.c
new file mode 100644
index 0000000000..a3ba3a368d
--- /dev/null
+++ b/src/common/pg_collation_fn_common.c
@@ -0,0 +1,90 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_collation_fn_common.c
+ * common routines to support manipulation of the pg_collation relation
+ *
+ * Portions Copyright (c) 1996-2017, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/common/pg_collation_fn_common.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifdef FRONTEND
+#include "postgres_fe.h"
+#else
+#include "postgres.h"
+#endif
+
+#include "catalog/pg_collation.h"
+#include "common/pg_collation_fn_common.h"
+
+
+/*
+ * Note that we search the table with pg_strcasecmp(), so variant
+ * capitalizations don't need their own entries.
+ */
+typedef struct collprovider_name
+{
+ char collprovider;
+ const char *name;
+} collprovider_name;
+
+static const collprovider_name collprovider_name_tbl[] =
+{
+ {COLLPROVIDER_DEFAULT, "default"},
+ {COLLPROVIDER_LIBC, "libc"},
+ {COLLPROVIDER_ICU, "icu"},
+ {'\0', NULL} /* end marker */
+};
+
+/*
+ * Get the collation provider from the given collation provider name.
+ *
+ * Return '\0' if we can't determine it.
+ */
+char
+get_collprovider(const char *name)
+{
+ int i;
+
+ if (!name)
+ return '\0';
+
+ /* Check the table */
+ for (i = 0; collprovider_name_tbl[i].name; ++i)
+ if (pg_strcasecmp(name, collprovider_name_tbl[i].name) == 0)
+ return collprovider_name_tbl[i].collprovider;
+
+ return '\0';
+}
+
+/*
+ * Get the name of the given collation provider.
+ *
+ * Return NULL if we can't determine it.
+ */
+const char *
+get_collprovider_name(char collprovider)
+{
+ int i;
+
+ /* Check the table */
+ for (i = 0; collprovider_name_tbl[i].collprovider; ++i)
+ if (collprovider_name_tbl[i].collprovider == collprovider)
+ return collprovider_name_tbl[i].name;
+
+ return NULL;
+}
+
+/*
+ * Return true if collation provider is nondefault and valid, and false otherwise.
+ */
+bool
+is_valid_nondefault_collprovider(char collprovider)
+{
+ return (collprovider == COLLPROVIDER_LIBC ||
+ collprovider == COLLPROVIDER_ICU);
+}
diff --git a/src/fe_utils/.gitignore b/src/fe_utils/.gitignore
index 37f5f7514d..b14041b5cf 100644
--- a/src/fe_utils/.gitignore
+++ b/src/fe_utils/.gitignore
@@ -1 +1,2 @@
/psqlscan.c
+/pg_collation_fn_common.c
diff --git a/src/fe_utils/Makefile b/src/fe_utils/Makefile
index 7d73800323..22e22ef1f4 100644
--- a/src/fe_utils/Makefile
+++ b/src/fe_utils/Makefile
@@ -19,7 +19,8 @@ include $(top_builddir)/src/Makefile.global
override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) $(CPPFLAGS)
-OBJS = mbprint.o print.o psqlscan.o simple_list.o string_utils.o conditional.o
+OBJS = mbprint.o print.o psqlscan.o simple_list.o string_utils.o conditional.o \
+ pg_collation_fn_common.o
all: libpgfeutils.a
@@ -33,6 +34,13 @@ psqlscan.c: FLEX_FIX_WARNING=yes
distprep: psqlscan.c
+# Pull in pg_collation_fn_common.c from src/common by symlinking the source
+# file. Linking to a shared library instead would expose us to risks of
+# version skew, so do it the hard way and stay statically linked.
+
+pg_collation_fn_common.c: % : $(top_srcdir)/src/common/%
+ rm -f $@ && $(LN_S) $< .
+
# libpgfeutils could be useful to contrib, so install it
install: all installdirs
$(INSTALL_STLIB) libpgfeutils.a '$(DESTDIR)$(libdir)/libpgfeutils.a'
@@ -45,6 +53,7 @@ uninstall:
clean distclean:
rm -f libpgfeutils.a $(OBJS) lex.backup
+ rm -f pg_collation_fn_common.c
# psqlscan.c is supposed to be in the distribution tarball,
# so do not clean it in the clean/distclean rules
diff --git a/src/include/commands/dbcommands.h b/src/include/commands/dbcommands.h
index 28bf21153d..3631ddb313 100644
--- a/src/include/commands/dbcommands.h
+++ b/src/include/commands/dbcommands.h
@@ -29,6 +29,7 @@ extern ObjectAddress AlterDatabaseOwner(const char *dbname, Oid newOwnerId);
extern Oid get_database_oid(const char *dbname, bool missingok);
extern char *get_database_name(Oid dbid);
-extern void check_encoding_locale_matches(int encoding, const char *collate, const char *ctype);
+extern void check_encoding_locale_matches(int encoding, const char *collate, const char *ctype,
+ char collprovider);
#endif /* DBCOMMANDS_H */
diff --git a/src/include/common/pg_collation_fn_common.h b/src/include/common/pg_collation_fn_common.h
new file mode 100644
index 0000000000..f05778dfad
--- /dev/null
+++ b/src/include/common/pg_collation_fn_common.h
@@ -0,0 +1,22 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_collation_fn_common.h
+ * prototypes for functions in common/pg_collation_fn_common.c
+ *
+ *
+ * Portions Copyright (c) 1996-2017, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/common/pg_collation_fn_common.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#ifndef PG_COLLATION_FN_COMMON_H
+#define PG_COLLATION_FN_COMMON_H
+
+extern char get_collprovider(const char *name);
+extern const char *get_collprovider_name(char collprovider);
+extern bool is_valid_nondefault_collprovider(char collprovider);
+
+#endif /* PG_COLLATION_FN_COMMON_H */
diff --git a/src/include/pg_config.h.win32 b/src/include/pg_config.h.win32
index dfd6972383..201db75fd0 100644
--- a/src/include/pg_config.h.win32
+++ b/src/include/pg_config.h.win32
@@ -713,6 +713,10 @@
/* Define to use /dev/urandom for random number generation */
/* #undef USE_DEV_URANDOM */
+/* Define to build with ICU support. (--with-icu) */
+/* #undef USE_ICU */
+
+
/* Define to 1 to build with LDAP support. (--with-ldap) */
/* #undef USE_LDAP */
diff --git a/src/include/port.h b/src/include/port.h
index 7e2004b178..b1a023dc34 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -502,6 +502,40 @@ extern int pg_get_encoding_from_locale(const char *ctype, bool write_message);
extern int pg_codepage_to_encoding(UINT cp);
#endif
+/* do not build libpq with ICU support */
+#ifndef LIBPQ_MAKE
+
+extern void check_locale_collprovider(const char *locale, char **canonname,
+ char *collprovider, char **collversion);
+extern bool locale_is_c(const char *locale);
+extern char *get_full_collation_name(const char *locale, char collprovider,
+ const char *collversion);
+
+#ifdef FRONTEND
+extern void get_collation_actual_version(char collprovider,
+ const char *collcollate,
+ char **collversion, bool *failure);
+#else
+extern char *get_collation_actual_version(char collprovider,
+ const char *collcollate);
+#endif
+
+#ifdef USE_ICU
+#define ICU_ROOT_LOCALE "root"
+
+/* Users of this must include unicode/ucol.h too. */
+struct UCollator;
+extern struct UCollator *open_collator(const char *collate);
+
+extern char *get_icu_language_tag(const char *localename);
+extern const char *get_icu_collate(const char *locale, const char *langtag);
+#ifdef WIN32
+extern char *check_icu_winlocale(const char *winlocale);
+#endif /* WIN32 */
+#endif /* USE_ICU */
+
+#endif /* not LIBPQ_MAKE */
+
/* port/inet_net_ntop.c */
extern char *inet_net_ntop(int af, const void *src, int bits,
char *dst, size_t size);
diff --git a/src/include/port/win32.h b/src/include/port/win32.h
index 9f48a58aed..7e3e7e57e6 100644
--- a/src/include/port/win32.h
+++ b/src/include/port/win32.h
@@ -16,7 +16,7 @@
* get support for GetLocaleInfoEx() with locales. For everything else
* the minimum version is Windows XP (0x0501).
*/
-#if defined(_MSC_VER) && _MSC_VER >= 1900
+#if defined(_MSC_VER) && _MSC_VER >= 1800
#define MIN_WINNT 0x0600
#else
#define MIN_WINNT 0x0501
diff --git a/src/include/utils/pg_locale.h b/src/include/utils/pg_locale.h
index 606952afd7..29c4894a9f 100644
--- a/src/include/utils/pg_locale.h
+++ b/src/include/utils/pg_locale.h
@@ -57,8 +57,10 @@ extern void assign_locale_numeric(const char *newval, void *extra);
extern bool check_locale_time(char **newval, void **extra, GucSource source);
extern void assign_locale_time(const char *newval, void *extra);
-extern bool check_locale(int category, const char *locale, char **canonname);
-extern char *pg_perm_setlocale(int category, const char *locale);
+extern bool check_locale(int category, const char *locale, char **canonname,
+ char collprovider);
+extern const char *pg_perm_setlocale(int category, const char *locale,
+ char collprovider);
extern void check_strxfrm_bug(void);
extern bool lc_collate_is_c(Oid collation);
@@ -102,11 +104,11 @@ typedef struct pg_locale_struct *pg_locale_t;
extern pg_locale_t pg_newlocale_from_collation(Oid collid);
-extern char *get_collation_actual_version(char collprovider, const char *collcollate);
-
#ifdef USE_ICU
extern int32_t icu_to_uchar(UChar **buff_uchar, const char *buff, size_t nbytes);
extern int32_t icu_from_uchar(char **result, const UChar *buff_uchar, int32_t len_uchar);
+extern const char *get_icu_default_collate(void);
+extern UCollator *get_default_collation_collator(void);
#endif
/* These functions convert from/to libc's wchar_t, *not* pg_wchar_t */
@@ -115,4 +117,6 @@ extern size_t wchar2char(char *to, const wchar_t *from, size_t tolen,
extern size_t char2wchar(wchar_t *to, size_t tolen,
const char *from, size_t fromlen, pg_locale_t locale);
+extern char get_default_collprovider(void);
+
#endif /* _PG_LOCALE_ */
diff --git a/src/interfaces/libpq/.gitignore b/src/interfaces/libpq/.gitignore
index 38779b23a4..b7d24dd369 100644
--- a/src/interfaces/libpq/.gitignore
+++ b/src/interfaces/libpq/.gitignore
@@ -4,3 +4,4 @@
# .c files that are symlinked in from elsewhere
/encnames.c
/wchar.c
+/pg_collation_fn_common.c
diff --git a/src/interfaces/libpq/Makefile b/src/interfaces/libpq/Makefile
index 025542dfe9..70bb769a7f 100644
--- a/src/interfaces/libpq/Makefile
+++ b/src/interfaces/libpq/Makefile
@@ -19,10 +19,11 @@ NAME= pq
SO_MAJOR_VERSION= 5
SO_MINOR_VERSION= $(MAJORVERSION)
-override CPPFLAGS := -DFRONTEND -DUNSAFE_STAT_OK -I$(srcdir) $(CPPFLAGS) -I$(top_builddir)/src/port -I$(top_srcdir)/src/port
+override CPPFLAGS := -DFRONTEND -DUNSAFE_STAT_OK -I$(srcdir) $(CPPFLAGS) -I$(top_builddir)/src/port -I$(top_srcdir)/src/port -DLIBPQ_MAKE
ifneq ($(PORTNAME), win32)
override CFLAGS += $(PTHREAD_CFLAGS)
endif
+LDFLAGS_INTERNAL += $(ICU_LIBS)
# The MSVC build system scrapes OBJS from this file. If you change any of
# the conditional additions of files to OBJS, update Mkvcbuild.pm to match.
diff --git a/src/port/chklocale.c b/src/port/chklocale.c
index 9b753c85e9..02d05e7014 100644
--- a/src/port/chklocale.c
+++ b/src/port/chklocale.c
@@ -23,8 +23,26 @@
#include <langinfo.h>
#endif
+#ifdef USE_ICU
+#include <unicode/ucol.h>
+#endif
+
+#include "catalog/pg_collation.h"
+#include "common/pg_collation_fn_common.h"
#include "mb/pg_wchar.h"
+/*
+ * In backend, we will use palloc/pfree. In frontend, use malloc/free.
+ */
+#ifndef FRONTEND
+#define STRDUP(s) pstrdup(s)
+#define ALLOC(size) palloc(size)
+#define FREE(s) pfree(s)
+#else
+#define STRDUP(s) strdup(s)
+#define ALLOC(size) malloc(size)
+#define FREE(s) free(s)
+#endif
/*
* This table needs to recognize all the CODESET spellings for supported
@@ -436,3 +454,583 @@ pg_get_encoding_from_locale(const char *ctype, bool write_message)
}
#endif /* (HAVE_LANGINFO_H && CODESET) || WIN32 */
+
+/* do not make libpq with icu */
+#ifndef LIBPQ_MAKE
+
+/*
+ * Check if the locale contains the modifier of the collation provider.
+ *
+ * Set up the collation provider according to the appropriate modifier or '\0'.
+ * Set up the collation version to NULL if we don't find it after the collation
+ * provider modifier.
+ *
+ * If locale is not NULL, a copy of its canonical name (without the collation
+ * provider modifier and the collation version) is stored in *canonname; it is
+ * palloc'd in the backend and malloc'd in the frontend, and may be empty.
+ */
+void
+check_locale_collprovider(const char *locale, char **canonname,
+ char *collprovider, char **collversion)
+{
+ const char *modifier_sign,
+ *dot_sign,
+ *cur_collprovider_end;
+ char cur_collprovider_name[NAMEDATALEN];
+ int cur_collprovider_len;
+ char cur_collprovider;
+
+ /* in case of failure or if we don't find them in the locale name */
+ if (canonname)
+ *canonname = NULL;
+ if (collprovider)
+ *collprovider = '\0';
+ if (collversion)
+ *collversion = NULL;
+
+ if (!locale)
+ return;
+
+ /* find the last occurrence of the modifier sign '@' in the locale */
+ modifier_sign = strrchr(locale, '@');
+
+ if (!modifier_sign)
+ {
+ /* just copy all the name */
+ if (canonname)
+ *canonname = STRDUP(locale);
+ return;
+ }
+
+ /* check if there's a version after the collation provider modifier */
+ if ((dot_sign = strchr(modifier_sign, '.')) == NULL)
+ cur_collprovider_end = &locale[strlen(locale)];
+ else
+ cur_collprovider_end = dot_sign;
+
+ cur_collprovider_len = cur_collprovider_end - modifier_sign - 1;
+ if (cur_collprovider_len + 1 > NAMEDATALEN)
+ {
+#ifdef FRONTEND
+ fprintf(stderr, _("collation provider name is too long: %s"), locale);
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else /* not FRONTEND */
+ ereport(ERROR,
+ (errmsg("collation provider name is too long: %s", locale)));
+#endif /* not FRONTEND */
+ return;
+ }
+
+ strncpy(cur_collprovider_name, modifier_sign + 1, cur_collprovider_len);
+ cur_collprovider_name[cur_collprovider_len] = '\0';
+
+ /* check if this is a valid collprovider name */
+ cur_collprovider = get_collprovider(cur_collprovider_name);
+ if (is_valid_nondefault_collprovider(cur_collprovider))
+ {
+ if (collprovider)
+ *collprovider = cur_collprovider;
+
+ if (canonname)
+ {
+ int canonname_len = modifier_sign - locale;
+
+ *canonname = ALLOC((canonname_len + 1) * sizeof(char));
+ if (*canonname)
+ {
+ strncpy(*canonname, locale, canonname_len);
+ (*canonname)[canonname_len] = '\0';
+ }
+ else
+ {
+#ifdef FRONTEND
+ fprintf(stderr, _("out of memory"));
+ /*
+ * keep newline separate so there's only one translatable string
+ */
+ fputc('\n', stderr);
+#else /* not FRONTEND */
+ ereport(ERROR, (errmsg("out of memory")));
+#endif /* not FRONTEND */
+ }
+ }
+
+ if (dot_sign && collversion)
+ *collversion = STRDUP(dot_sign + 1);
+ }
+ else
+ {
+ /* just copy all the name */
+ if (canonname)
+ *canonname = STRDUP(locale);
+ }
+}
+
+/*
+ * Return true if locale is "C" or "POSIX".
+ */
+bool
+locale_is_c(const char *locale)
+{
+ return locale && (strcmp(locale, "C") == 0 || strcmp(locale, "POSIX") == 0);
+}
+
+/*
+ * Return locale with "@provider" and, if given, ".version" appended.
+ *
+ * Return NULL if locale is NULL.
+ */
+char *
+get_full_collation_name(const char *locale, char collprovider,
+ const char *collversion)
+{
+ char *new_locale;
+ int old_len,
+ len_with_provider,
+ new_len;
+ const char *collprovider_name;
+
+ if (!locale)
+ return NULL;
+
+ collprovider_name = get_collprovider_name(collprovider);
+ Assert(collprovider_name);
+
+ old_len = strlen(locale);
+ new_len = len_with_provider = old_len + 1 + strlen(collprovider_name);
+ if (collversion && *collversion)
+ new_len += 1 + strlen(collversion);
+
+ new_locale = ALLOC((new_len + 1) * sizeof(char));
+ if (!new_locale)
+ {
+#ifdef FRONTEND
+ fprintf(stderr, _("out of memory"));
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else /* not FRONTEND */
+ ereport(ERROR, (errmsg("out of memory")));
+#endif /* not FRONTEND */
+
+ return NULL;
+ }
+
+ /* add the collation provider modifier */
+ strcpy(new_locale, locale);
+ new_locale[old_len] = '@';
+ strcpy(&new_locale[old_len + 1], collprovider_name);
+
+ /* add the collation version if needed */
+ if (collversion && *collversion)
+ {
+ new_locale[len_with_provider] = '.';
+ strcpy(&new_locale[len_with_provider + 1], collversion);
+ }
+
+ new_locale[new_len] = '\0';
+
+ return new_locale;
+}
+
+/*
+ * Get provider-specific collation version string for the given collation from
+ * the operating system/library.
+ *
+ * A particular provider must always either return a non-NULL string or return
+ * NULL (if it doesn't support versions). It must not return NULL for some
+ * collcollate and not NULL for others.
+ */
+#ifdef FRONTEND
+void
+get_collation_actual_version(char collprovider, const char *collcollate,
+ char **collversion, bool *failure)
+{
+ if (failure)
+ *failure = false;
+
+#ifdef USE_ICU
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+ UCollator *collator = open_collator(collcollate);
+ UVersionInfo versioninfo;
+ char buf[U_MAX_VERSION_STRING_LENGTH];
+
+ if (collator)
+ {
+ ucol_getVersion(collator, versioninfo);
+ ucol_close(collator);
+
+ u_versionToString(versioninfo, buf);
+ if (collversion)
+ *collversion = STRDUP(buf);
+ }
+ else
+ {
+ if (collversion)
+ *collversion = NULL;
+ if (failure)
+ *failure = true;
+ }
+ }
+ else
+#endif
+ {
+ if (collversion)
+ *collversion = NULL;
+ }
+}
+#else /* not FRONTEND */
+char *
+get_collation_actual_version(char collprovider, const char *collcollate)
+{
+ char *collversion;
+
+#ifdef USE_ICU
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+ UCollator *collator = open_collator(collcollate);
+ UVersionInfo versioninfo;
+ char buf[U_MAX_VERSION_STRING_LENGTH];
+
+ ucol_getVersion(collator, versioninfo);
+ ucol_close(collator);
+
+ u_versionToString(versioninfo, buf);
+ collversion = STRDUP(buf);
+ }
+ else
+#endif
+ collversion = NULL;
+
+ return collversion;
+}
+#endif /* not FRONTEND */
+
+#ifdef USE_ICU
+/*
+ * Open the collator for this icu locale. Return NULL in case of failure.
+ */
+UCollator *
+open_collator(const char *collate)
+{
+ UCollator *collator;
+ UErrorCode status;
+ const char *save = uloc_getDefault();
+ char *save_dup;
+
+ if (!save)
+ {
+#ifdef FRONTEND
+ fprintf(stderr, _("ICU error: uloc_getDefault() failed"));
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR, (errmsg("ICU error: uloc_getDefault() failed")));
+#endif
+ return NULL;
+ }
+
+ /* save may be pointing at a modifiable scratch variable, so copy it. */
+ save_dup = STRDUP(save);
+
+ /* set the default locale to root */
+ status = U_ZERO_ERROR;
+ uloc_setDefault(ICU_ROOT_LOCALE, &status);
+ if (U_FAILURE(status))
+ {
+#ifdef FRONTEND
+ fprintf(stderr, _("ICU error: failed to set the default locale to \"%s\": %s"),
+ ICU_ROOT_LOCALE, u_errorName(status));
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR,
+ (errmsg("ICU error: failed to set the default locale to \"%s\": %s",
+ ICU_ROOT_LOCALE, u_errorName(status))));
+#endif
+ return NULL;
+ }
+
+ /* get a collator for this collate */
+ status = U_ZERO_ERROR;
+ collator = ucol_open(collate, &status);
+ if (U_FAILURE(status))
+ {
+#ifdef FRONTEND
+ fprintf(stderr, _("ICU error: could not open collator for locale \"%s\": %s"),
+ collate, u_errorName(status));
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR,
+ (errmsg("ICU error: could not open collator for locale \"%s\": %s",
+ collate, u_errorName(status))));
+#endif
+ collator = NULL;
+ }
+
+ /* restore old value of the default locale. */
+ status = U_ZERO_ERROR;
+ uloc_setDefault(save_dup, &status);
+ if (U_FAILURE(status))
+ {
+#ifdef FRONTEND
+ fprintf(stderr, _("ICU error: failed to restore old locale \"%s\": %s"),
+ save_dup, u_errorName(status));
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR,
+ (errmsg("ICU error: failed to restore old locale \"%s\": %s",
+ save_dup, u_errorName(status))));
+#endif
+ return NULL;
+ }
+ FREE(save_dup);
+
+ return collator;
+}
+
+/*
+ * Get the ICU language tag for a locale name.
+ * The result is a palloc'd (backend) or malloc'd (frontend) string.
+ * Return NULL in case of failure or if localename is NULL.
+ */
+char *
+get_icu_language_tag(const char *localename)
+{
+ char buf[ULOC_FULLNAME_CAPACITY];
+ UErrorCode status = U_ZERO_ERROR;
+
+ if (!localename)
+ return NULL;
+
+ uloc_toLanguageTag(localename, buf, sizeof(buf), TRUE, &status);
+ if (U_FAILURE(status))
+ {
+#ifdef FRONTEND
+ fprintf(stderr,
+ _("ICU error: could not convert locale name \"%s\" to language tag: %s"),
+ localename, u_errorName(status));
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR,
+ (errmsg("ICU error: could not convert locale name \"%s\" to language tag: %s",
+ localename, u_errorName(status))));
+#endif
+ return NULL;
+ }
+ return STRDUP(buf);
+}
+
+/*
+ * Get the icu collation name.
+ */
+const char *
+get_icu_collate(const char *locale, const char *langtag)
+{
+ return U_ICU_VERSION_MAJOR_NUM >= 54 ? langtag : locale;
+}
+
+#ifdef WIN32
+/*
+ * Get the Language Code Identifier (LCID) for the Windows locale.
+ *
+ * Return zero in case of failure.
+ */
+static uint32
+get_lcid(const wchar_t *winlocale)
+{
+ /*
+ * The second argument to the LocaleNameToLCID function is:
+ * - Prior to Windows 7: reserved; should always be 0.
+ * - Beginning in Windows 7: use LOCALE_ALLOW_NEUTRAL_NAMES to allow the
+ * return of lcids of locales without regions.
+ */
+#if (NTDDI_VERSION >= NTDDI_WIN7)
+ return LocaleNameToLCID(winlocale, LOCALE_ALLOW_NEUTRAL_NAMES);
+#else
+ return LocaleNameToLCID(winlocale, 0);
+#endif
+}
+
+/*
+ * char2wchar_ascii --- convert multibyte characters to wide characters
+ *
+ * This is a simplified version of the char2wchar() function from backend.
+ */
+static size_t
+char2wchar_ascii(wchar_t *to, size_t tolen, const char *from, size_t fromlen)
+{
+ size_t result;
+
+ if (tolen == 0)
+ return 0;
+
+ /* Win32 API does not work for zero-length input */
+ if (fromlen == 0)
+ result = 0;
+ else
+ {
+ result = MultiByteToWideChar(CP_ACP, 0, from, fromlen, to, tolen - 1);
+ /* A zero return is failure */
+ if (result == 0)
+ result = -1;
+ }
+
+ if (result != -1)
+ {
+ Assert(result < tolen);
+ /* Append trailing null wchar (MultiByteToWideChar() does not) */
+ to[result] = 0;
+ }
+
+ return result;
+}
+
+/*
+ * Get the canonical ICU name for the Windows locale.
+ *
+ * Return a palloc'd (backend) or malloc'd (frontend) string, or NULL on failure.
+ */
+char *
+check_icu_winlocale(const char *winlocale)
+{
+ uint32 lcid;
+ char canonname_buf[ULOC_FULLNAME_CAPACITY];
+ UErrorCode status = U_ZERO_ERROR;
+#if (_MSC_VER >= 1400) /* VC8.0 or later */
+ _locale_t loct = NULL;
+#endif
+
+ if (winlocale == NULL)
+ return NULL;
+
+ /* Get the Language Code Identifier (LCID). */
+
+#if (_MSC_VER >= 1400) /* VC8.0 or later */
+ loct = _create_locale(LC_COLLATE, winlocale);
+
+ if (loct != NULL)
+ {
+#if (_MSC_VER >= 1700) /* Visual Studio 2012 or later */
+ if ((lcid = get_lcid(loct->locinfo->locale_name[LC_COLLATE])) == 0)
+ {
+ /* there's an error */
+#ifdef FRONTEND
+ fprintf(stderr,
+ _("failed to get the Language Code Identifier (LCID) for locale \"%s\""),
+ winlocale);
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else /* not FRONTEND */
+ ereport(ERROR,
+ (errmsg("failed to get the Language Code Identifier (LCID) for locale \"%s\"",
+ winlocale)));
+#endif /* not FRONTEND */
+ _free_locale(loct);
+ return NULL;
+ }
+#else /* _MSC_VER >= 1400 && _MSC_VER < 1700 */
+ if ((lcid = loct->locinfo->lc_handle[LC_COLLATE]) == 0)
+ {
+ /* there's an error */
+#ifdef FRONTEND
+ fprintf(stderr,
+ _("failed to get the Language Code Identifier (LCID) for locale \"%s\""),
+ winlocale);
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else /* not FRONTEND */
+ ereport(ERROR,
+ (errmsg("failed to get the Language Code Identifier (LCID) for locale \"%s\"",
+ winlocale)));
+#endif /* not FRONTEND */
+ _free_locale(loct);
+ return NULL;
+ }
+#endif /* _MSC_VER >= 1400 && _MSC_VER < 1700 */
+ _free_locale(loct);
+ }
+ else
+#endif /* VC8.0 or later */
+ {
+ if (strlen(winlocale) == 0)
+ {
+ lcid = LOCALE_USER_DEFAULT;
+ }
+ else
+ {
+ size_t locale_len = strlen(winlocale);
+ wchar_t *wlocale = (wchar_t*) ALLOC(
+ (locale_len + 1) * sizeof(wchar_t));
+ /* Locale names use only ASCII */
+ size_t locale_wlen = char2wchar_ascii(wlocale, locale_len + 1,
+ winlocale, locale_len);
+ if (locale_wlen == -1)
+ {
+ /* there's an error */
+#ifdef FRONTEND
+ fprintf(stderr,
+ _("failed to convert locale \"%s\" to wide characters"),
+ winlocale);
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR,
+ (errmsg("failed to convert locale \"%s\" to wide characters",
+ winlocale)));
+#endif
+ FREE(wlocale);
+ return NULL;
+ }
+
+ if ((lcid = get_lcid(wlocale)) == 0)
+ {
+ /* there's an error */
+#ifdef FRONTEND
+ fprintf(stderr,
+ _("failed to get the Language Code Identifier (LCID) for locale \"%s\""),
+ winlocale);
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR,
+ (errmsg("failed to get the Language Code Identifier (LCID) for locale \"%s\"",
+ winlocale)));
+#endif
+ FREE(wlocale);
+ return NULL;
+ }
+
+ FREE(wlocale);
+ }
+ }
+
+ /* Get the ICU canonname. */
+
+ uloc_getLocaleForLCID(lcid, canonname_buf, sizeof(canonname_buf), &status);
+ if (U_FAILURE(status))
+ {
+#ifdef FRONTEND
+ fprintf(stderr,
+ _("ICU error: failed to get the locale name for LCID 0x%04x: %s"),
+ lcid, u_errorName(status));
+ /* keep newline separate so there's only one translatable string */
+ fputc('\n', stderr);
+#else
+ ereport(ERROR,
+ (errmsg("ICU error: failed to get the locale name for LCID 0x%04x: %s",
+ lcid, u_errorName(status))));
+#endif
+ return NULL;
+ }
+
+ return STRDUP(canonname_buf);
+}
+#endif /* WIN32 */
+#endif /* USE_ICU */
+
+#endif /* not LIBPQ_MAKE */
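
For readers following the name format handled by check_locale_collprovider() and get_full_collation_name() above, here is a minimal standalone sketch of the "locale@provider.version" convention. It is not the patch's code: split_locale(), join_locale(), dup_range() and is_known_provider() are hypothetical helpers written only for illustration, plain malloc is used instead of the STRDUP/ALLOC macros, and the provider list ("icu", "libc") is an assumption standing in for is_valid_nondefault_collprovider(). As in the patch, the split happens at the last '@', so a libc modifier such as "@latin" stays part of the locale name.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: only these two provider names are recognised in this sketch. */
static int
is_known_provider(const char *name, size_t len)
{
    return (len == 3 && strncmp(name, "icu", 3) == 0) ||
           (len == 4 && strncmp(name, "libc", 4) == 0);
}

/* Return a malloc'd, NUL-terminated copy of len bytes starting at start. */
static char *
dup_range(const char *start, size_t len)
{
    char *s = malloc(len + 1);

    if (s)
    {
        memcpy(s, start, len);
        s[len] = '\0';
    }
    return s;
}

/*
 * Split "name[@provider[.version]]" at the last '@'.
 * Outputs are malloc'd; *provider and *version stay NULL when absent.
 * Returns 0 on success, -1 on allocation failure.
 */
static int
split_locale(const char *locale, char **base, char **provider, char **version)
{
    const char *at;

    assert(locale && base && provider && version);
    *base = *provider = *version = NULL;

    at = strrchr(locale, '@');
    if (at)
    {
        const char *dot = strchr(at, '.');
        const char *end = dot ? dot : locale + strlen(locale);
        size_t      plen = (size_t) (end - at - 1);

        if (is_known_provider(at + 1, plen))
        {
            *base = dup_range(locale, (size_t) (at - locale));
            *provider = dup_range(at + 1, plen);
            if (dot)
                *version = dup_range(dot + 1, strlen(dot + 1));
            if (*base && *provider && (!dot || *version))
                return 0;
            free(*base);
            free(*provider);
            free(*version);
            *base = *provider = *version = NULL;
            return -1;
        }
    }

    /* No recognised provider modifier: the whole string is the locale name. */
    *base = dup_range(locale, strlen(locale));
    return *base ? 0 : -1;
}

/* Build "base@provider" or "base@provider.version"; returns NULL if out of memory. */
static char *
join_locale(const char *base, const char *provider, const char *version)
{
    int     has_version = version && *version;
    size_t  len = strlen(base) + 1 + strlen(provider) +
                  (has_version ? 1 + strlen(version) : 0);
    char   *out = malloc(len + 1);

    if (!out)
        return NULL;
    if (has_version)
        snprintf(out, len + 1, "%s@%s.%s", base, provider, version);
    else
        snprintf(out, len + 1, "%s@%s", base, provider);
    return out;
}
```

With these helpers, "be_BY@latin@icu" splits into the locale "be_BY@latin" and the provider "icu", while "be_BY@icu@latin" has no recognised trailing provider and is kept whole, which is consistent with the "invalid modifier order" cases in the TAP tests below.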
diff --git a/src/test/Makefile b/src/test/Makefile
index efb206aa75..d74c6150ef 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,7 @@ subdir = src/test
top_builddir = ../..
include $(top_builddir)/src/Makefile.global
-SUBDIRS = perl regress isolation modules authentication recovery subscription
+SUBDIRS = perl regress isolation modules authentication recovery subscription default_collation
# Test suites that are not safe by default but can be run if selected
# by the user via the whitespace-separated list in variable
diff --git a/src/test/default_collation/Makefile b/src/test/default_collation/Makefile
new file mode 100644
index 0000000000..2efe8becb7
--- /dev/null
+++ b/src/test/default_collation/Makefile
@@ -0,0 +1,28 @@
+# src/test/default_collation/Makefile
+
+subdir = src/test/default_collation
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+ifeq ($(with_icu),yes)
+check:
+ $(MAKE) -C icu check
+check-utf8:
+ $(MAKE) -C icu.utf8 check
+ $(MAKE) -C libc.utf8 check
+else
+check:
+ $(MAKE) -C libc check
+check-utf8:
+ $(MAKE) -C libc.utf8 check
+endif
+
+# We don't check libc/ if with_icu or vice versa, but we do want "make clean" to
+# recurse into it. The same goes for libc.utf8/ or icu.utf8/, which we don't
+# check by default.
+ALWAYS_SUBDIRS = libc libc.utf8 icu icu.utf8
+
+clean distclean maintainer-clean:
+ for d in $(ALWAYS_SUBDIRS); do \
+ $(MAKE) -C $$d clean || exit; \
+ done
diff --git a/src/test/default_collation/icu.utf8/.gitignore b/src/test/default_collation/icu.utf8/.gitignore
new file mode 100644
index 0000000000..871e943d50
--- /dev/null
+++ b/src/test/default_collation/icu.utf8/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/default_collation/icu.utf8/Makefile b/src/test/default_collation/icu.utf8/Makefile
new file mode 100644
index 0000000000..7adecfd240
--- /dev/null
+++ b/src/test/default_collation/icu.utf8/Makefile
@@ -0,0 +1,11 @@
+# src/test/default_collation/icu.utf8/Makefile
+
+subdir = src/test/default_collation/icu.utf8
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+ $(prove_check)
+
+clean distclean maintainer-clean:
+ rm -rf tmp_check
diff --git a/src/test/default_collation/icu.utf8/t/001_default_collation.pl b/src/test/default_collation/icu.utf8/t/001_default_collation.pl
new file mode 100644
index 0000000000..617c06d2d7
--- /dev/null
+++ b/src/test/default_collation/icu.utf8/t/001_default_collation.pl
@@ -0,0 +1,799 @@
+use strict;
+use warnings;
+
+use Config;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 188;
+
+my $tempdir = TestLib::tempdir;
+my $datadir = "$tempdir/data";
+
+# test initdb
+
+sub test_initdb
+{
+ my ($test_name, $options, $expected_collprovider, $error_message) = @_;
+ my ($in_initdb, $out_initdb, $err_initdb);
+
+ mkdir $datadir;
+
+ my @command = (qw(initdb -A trust -N -D), $datadir, split(" ", $options));
+ print "# Running: " . join(" ", @command) . "\n";
+ my $result = IPC::Run::run \@command, \$in_initdb, \$out_initdb,
+ \$err_initdb;
+
+ if ($error_message)
+ {
+ like($err_initdb,
+ qr{$error_message},
+ "initdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_initdb, "", "\"@command\" no stderr");
+ like($out_initdb,
+ qr{The default collation provider is \"$expected_collprovider\"\.},
+ "initdb: $test_name: check output");
+ }
+
+ File::Path::rmtree $datadir;
+}
+
+sub psql
+{
+ my ($command, $db) = @_;
+ my ($result, $in, $out, $err);
+ my @psql = ('psql', '-X', '-c', $command);
+ if (defined($db))
+ {
+ push(@psql, $db);
+ }
+ print "# Running: " . join(" ", @psql) . "\n";
+ $result = IPC::Run::run \@psql, \$in, \$out, \$err;
+ ($result, $out, $err);
+}
+
+# --locale
+
+test_initdb(
+ "en_US.utf8 locale",
+ "--locale=en_US.utf8",
+ "icu",
+ "");
+
+test_initdb(
+ "en_US.utf8 locale with C ctype",
+ "--locale=en_US.utf8 --lc-ctype=C",
+ "",
+ "collations with different collate and ctype values are not supported by "
+ . "ICU");
+
+test_initdb(
+ "be_BY\@latin icu locale",
+ "--locale=be_BY\@latin\@icu",
+ "icu",
+ "");
+
+test_initdb(
+ "be_BY\@latin icu locale invalid modifier order",
+ "--locale=be_BY\@icu\@latin",
+ "",
+ "invalid locale name \"be_BY\@icu\@latin\"");
+
+# --lc-collate with the same --lc-ctype if needed
+
+test_initdb(
+ "en_US.utf8 lc_collate",
+ "--lc-collate=en_US.utf8 --lc-ctype=en_US.utf8",
+ "icu",
+ "");
+
+test_initdb(
+ "en_US.utf8 lc_collate with C ctype",
+ "--lc-collate=en_US.utf8 --lc-ctype=C",
+ "",
+ "collations with different collate and ctype values are not supported by "
+ . "ICU");
+
+test_initdb(
+ "be_BY\@latin icu lc_collate",
+ "--lc-collate=be_BY\@latin\@icu --lc-ctype=be_BY\@latin",
+ "icu",
+ "");
+
+test_initdb(
+ "be_BY\@latin icu lc_collate invalid modifier order",
+ "--lc-collate=be_BY\@icu\@latin",
+ "",
+ "invalid locale name \"be_BY\@icu\@latin\" \\(provider \"libc\"\\)");
+
+# test createdb and CREATE DATABASE
+
+sub test_createdb
+{
+ my ($test_name, $options, $expected_collprovider, $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ @command = ("createdb",
+ split(" ", $options),
+ "--template=template0",
+ "mydb");
+
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "createdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+
+ ($result, $out_command, $err_command) = psql(
+ "select datcollate from pg_database where datname = 'mydb';");
+
+ if ($expected_collprovider eq "libc")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider\n},
+ "createdb: $test_name: check pg_database.datcollate");
+ }
+ elsif ($expected_collprovider eq "icu")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider([\.\d]+)?\n},
+ "createdb: $test_name: check pg_database.datcollate");
+ }
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+sub test_create_database
+{
+ my ($test_name,
+ $options,
+ $expected_collprovider,
+ $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ ($result, $out_command, $err_command) = psql(
+ "create database mydb "
+ . $options
+ . " template = template0;");
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "CREATE DATABASE: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "CREATE DATABASE: $test_name: exit code 0");
+ is($err_command, "", "CREATE DATABASE: $test_name: no stderr");
+ like($out_command, qr{CREATE DATABASE}, "CREATE DATABASE: $test_name: check output");
+
+ ($result, $out_command, $err_command) = psql(
+ "select datcollate from pg_database where datname = 'mydb';");
+
+ if ($expected_collprovider eq "libc")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider\n},
+ "CREATE DATABASE: $test_name: check pg_database.datcollate");
+ }
+ elsif ($expected_collprovider eq "icu")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider([\.\d]+)?\n},
+ "CREATE DATABASE: $test_name: check pg_database.datcollate");
+ }
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+sub test_default_collation
+{
+ my ($createdb_options, $collation, $expected_collprovider, @commands) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ @command = ("createdb", split(" ", $createdb_options), "mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+
+ ($result, $out_command, $err_command) = psql(
+ "select datcollate from pg_database where datname = 'mydb';");
+
+ if ($expected_collprovider eq "libc")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider\n},
+ "\"@command\" check output");
+ }
+ elsif ($expected_collprovider eq "icu")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider([\.\d]+)?\n},
+ "\"@command\" check output");
+ }
+
+ for (my $row = 0; $row <= $#commands; $row++)
+ {
+ my ($command_text, $expected) = @{$commands[$row]};
+ ($result, $out_command, $err_command) = psql($command_text, "mydb");
+
+ ok($result, "\"$command_text\" exit code 0");
+ is($err_command, "", "\"$command_text\" no stderr");
+ if ($out_command)
+ {
+ is(
+ $out_command,
+ $expected,
+ "default collation "
+ . $collation
+ . ": \""
+ . $command_text
+ . "\" check output");
+ }
+ }
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+}
+
+my $node = get_new_node('main');
+$node->init;
+$node->start;
+local $ENV{PGPORT} = $node->port;
+
+my @command = ("createuser --createdb --no-superuser non_superuser");
+print "# Running: " . join(" ", @command) . "\n";
+system(@command);
+
+# test createdb
+
+# --locale
+
+test_createdb(
+ "en_US.utf8 locale",
+ "--locale=en_US.utf8",
+ "icu",
+ "");
+
+test_createdb(
+ "be_BY\@latin icu locale",
+ "--locale=be_BY\@latin\@icu",
+ "icu",
+ "");
+
+test_createdb(
+ "be_BY\@latin icu locale invalid modifier order",
+ "--locale=be_BY\@icu\@latin",
+ "",
+ "invalid locale name: \"be_BY\@icu\@latin\"");
+
+# --lc-collate with the same --lc-ctype if needed
+
+test_createdb(
+ "en_US.utf8 lc_collate",
+ "--lc-collate=en_US.utf8 --lc-ctype=en_US.utf8",
+ "icu",
+ "");
+
+test_createdb(
+ "en_US.utf8 lc_collate with C ctype",
+ "--lc-collate=en_US.utf8 --lc-ctype=C",
+ "",
+ "collations with different collate and ctype values are not supported by "
+ . "ICU");
+
+test_createdb(
+ "be_BY\@latin icu lc_collate",
+ "--lc-collate=be_BY\@latin\@icu --lc-ctype=be_BY\@latin",
+ "icu",
+ "");
+
+test_createdb(
+ "be_BY\@latin icu lc_collate invalid modifier order",
+ "--lc-collate=be_BY\@icu\@latin",
+ "",
+ "invalid locale name: \"be_BY\@icu\@latin\" \\(provider \"libc\"\\)");
+
+# test CREATE DATABASE
+
+test_create_database(
+ "en_US.utf8 lc_collate",
+ "LC_COLLATE = 'en_US.utf8' LC_CTYPE = 'en_US.utf8'",
+ "icu",
+ "");
+
+test_create_database(
+ "en_US.utf8 lc_collate with C ctype",
+ "LC_COLLATE = 'en_US.utf8' LC_CTYPE = 'C'",
+ "",
+ "collations with different collate and ctype values are not supported by "
+ . "ICU");
+
+test_create_database(
+ "be_BY\@latin icu lc_collate",
+ "LC_COLLATE = 'be_BY\@latin' LC_CTYPE = 'be_BY\@latin'",
+ "icu",
+ "");
+
+test_create_database(
+ "be_BY\@latin icu lc_collate invalid modifier order",
+ "LC_COLLATE = 'be_BY\@icu\@latin'",
+ "",
+ "invalid locale name: \"be_BY\@icu\@latin\" \\(provider \"libc\"\\)");
+
+# test default collation behaviour
+# use commands and outputs from the regression test collate.icu.utf8
+
+test_default_collation(
+ "--lc-collate=en_US.utf8 --lc-ctype=en_US.utf8 --template=template0",
+ "en_US.utf8\@icu",
+ "icu",
+ (
+ [
+ "CREATE TABLE collate_test1 (a int, b text NOT NULL);",
+ "CREATE TABLE\n"
+ ],
+ [
+ "INSERT INTO collate_test1 VALUES "
+ . "(1, 'abc'), (2, 'äbc'), (3, 'bbc'), (4, 'ABC');",
+ "INSERT 0 4\n"],
+ [
+ "SELECT * FROM collate_test1 WHERE b >= 'bbc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 3 | bbc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, b FROM collate_test1 ORDER BY b;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # star expansion
+ [
+ "SELECT * FROM collate_test1 ORDER BY b;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # upper/lower
+ ["CREATE TABLE collate_test10 (a int, x text);", "CREATE TABLE\n", ""],
+ [
+ "INSERT INTO collate_test10 VALUES (1, 'hij'), (2, 'HIJ');",
+ "INSERT 0 2\n"
+ ],
+ [
+ "SELECT a, lower(x), upper(x), initcap(x) FROM collate_test10;",
+ " a | lower | upper | initcap \n"
+ . "---+-------+-------+---------\n"
+ . " 1 | hij | HIJ | Hij\n"
+ . " 2 | hij | HIJ | Hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ # LIKE/ILIKE
+ [
+ "SELECT * FROM collate_test1 WHERE b LIKE 'abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b LIKE 'abc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b LIKE '%bc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(3 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ILIKE 'abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ILIKE 'abc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ILIKE '%bc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 4 | ABC\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'Türkiye' ILIKE '%KI%' AS \"true\";",
+ " true \n"
+ . "------\n"
+ . " t\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'bıt' ILIKE 'BIT' AS \"false\";",
+ " false \n"
+ . "-------\n"
+ . " f\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ # regular expressions
+ [
+ "SELECT * FROM collate_test1 WHERE b ~ '^abc\$';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~ '^abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~ 'bc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(3 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~* '^abc\$';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~* '^abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~* 'bc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 4 | ABC\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ ["CREATE TABLE collate_test6 (a int, b text);", "CREATE TABLE\n", ""],
+ [
+ "INSERT INTO collate_test6 VALUES "
+ . "(1, 'abc'), (2, 'ABC'), (3, '123'), (4, 'ab1'), "
+ . "(5, 'a1!'), (6, 'a c'), (7, '!.;'), (8, ' '), "
+ . "(9, 'äbç'), (10, 'ÄBÇ');",
+ "INSERT 0 10\n"
+ ],
+ [
+ "SELECT b, "
+ . "b ~ '^[[:alpha:]]+\$' AS is_alpha, "
+ . "b ~ '^[[:upper:]]+\$' AS is_upper, "
+ . "b ~ '^[[:lower:]]+\$' AS is_lower, "
+ . "b ~ '^[[:digit:]]+\$' AS is_digit, "
+ . "b ~ '^[[:alnum:]]+\$' AS is_alnum, "
+ . "b ~ '^[[:graph:]]+\$' AS is_graph, "
+ . "b ~ '^[[:print:]]+\$' AS is_print, "
+ . "b ~ '^[[:punct:]]+\$' AS is_punct, "
+ . "b ~ '^[[:space:]]+\$' AS is_space "
+ . "FROM collate_test6;",
+ " b | is_alpha | is_upper | is_lower | is_digit | is_alnum | is_graph | is_print | is_punct | is_space \n"
+ . "-----+----------+----------+----------+----------+----------+----------+----------+----------+----------\n"
+ . " abc | t | f | t | f | t | t | t | f | f\n"
+ . " ABC | t | t | f | f | t | t | t | f | f\n"
+ . " 123 | f | f | f | t | t | t | t | f | f\n"
+ . " ab1 | f | f | f | f | t | t | t | f | f\n"
+ . " a1! | f | f | f | f | f | t | t | f | f\n"
+ . " a c | f | f | f | f | f | f | t | f | f\n"
+ . " !.; | f | f | f | f | f | t | t | t | f\n"
+ . " | f | f | f | f | f | f | t | f | t\n"
+ . " äbç | t | f | t | f | t | t | t | f | f\n"
+ . " ÄBÇ | t | t | f | f | t | t | t | f | f\n"
+ . "(10 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'Türkiye' ~* 'KI' AS \"true\";",
+ " true \n"
+ . "------\n"
+ . " t\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'bıt' ~* 'BIT' AS \"false\";",
+ " false \n"
+ . "-------\n"
+ . " f\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, lower(coalesce(x, 'foo')) FROM collate_test10;",
+ " a | lower \n"
+ . "---+-------\n"
+ . " 1 | hij\n"
+ . " 2 | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, b, greatest(b, 'CCC') FROM collate_test1 ORDER BY 3;",
+ " a | b | greatest \n"
+ . "---+-----+----------\n"
+ . " 1 | abc | CCC\n"
+ . " 2 | äbc | CCC\n"
+ . " 3 | bbc | CCC\n"
+ . " 4 | ABC | CCC\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, x, lower(greatest(x, 'foo')) FROM collate_test10;",
+ " a | x | lower \n"
+ . "---+-----+-------\n"
+ . " 1 | hij | hij\n"
+ . " 2 | HIJ | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, nullif(b, 'abc') FROM collate_test1 ORDER BY 2;",
+ " a | nullif \n"
+ . "---+--------\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 1 | \n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, lower(nullif(x, 'foo')) FROM collate_test10;",
+ " a | lower \n"
+ . "---+-------\n"
+ . " 1 | hij\n"
+ . " 2 | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, CASE b WHEN 'abc' THEN 'abcd' ELSE b END "
+ . "FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+------\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 1 | abcd\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ ["CREATE DOMAIN testdomain AS text;", "CREATE DOMAIN\n", ""],
+ [
+ "SELECT a, b::testdomain FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, lower(x::testdomain) FROM collate_test10;",
+ " a | lower \n"
+ . "---+-------\n"
+ . " 1 | hij\n"
+ . " 2 | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT min(b), max(b) FROM collate_test1;",
+ " min | max \n"
+ . "-----+-----\n"
+ . " abc | bbc\n"
+ . "(1 row)\n"
+ . "\n",
+ ""
+ ],
+ [
+ "SELECT array_agg(b ORDER BY b) FROM collate_test1;",
+ " array_agg \n"
+ . "-------------------\n"
+ . " {abc,ABC,äbc,bbc}\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, b FROM collate_test1 "
+ . "UNION ALL "
+ . "SELECT a, b FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 3 | bbc\n"
+ . "(8 rows)\n"
+ . "\n"
+ ],
+ # casting
+ [
+ "SELECT a, CAST(b AS varchar) FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # propagation of collation in SQL functions (inlined and non-inlined
+ # cases) and plpgsql functions too
+ [
+ "CREATE FUNCTION mylt (text, text) RETURNS boolean LANGUAGE sql "
+ . "AS \$\$ select \$1 < \$2 \$\$;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "CREATE FUNCTION mylt_noninline (text, text) "
+ . "RETURNS boolean LANGUAGE sql "
+ . "AS \$\$ select \$1 < \$2 limit 1 \$\$;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "CREATE FUNCTION mylt_plpgsql (text, text) "
+ . "RETURNS boolean LANGUAGE plpgsql "
+ . "AS \$\$ begin return \$1 < \$2; end \$\$;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "SELECT a.b AS a, b.b AS b, a.b < b.b AS lt, "
+ . "mylt(a.b, b.b), mylt_noninline(a.b, b.b), mylt_plpgsql(a.b, b.b) "
+ . "FROM collate_test1 a, collate_test1 b "
+ . "ORDER BY a.b, b.b;",
+ " a | b | lt | mylt | mylt_noninline | mylt_plpgsql \n"
+ . "-----+-----+----+------+----------------+--------------\n"
+ . " abc | abc | f | f | f | f\n"
+ . " abc | ABC | t | t | t | t\n"
+ . " abc | äbc | t | t | t | t\n"
+ . " abc | bbc | t | t | t | t\n"
+ . " ABC | abc | f | f | f | f\n"
+ . " ABC | ABC | f | f | f | f\n"
+ . " ABC | äbc | t | t | t | t\n"
+ . " ABC | bbc | t | t | t | t\n"
+ . " äbc | abc | f | f | f | f\n"
+ . " äbc | ABC | f | f | f | f\n"
+ . " äbc | äbc | f | f | f | f\n"
+ . " äbc | bbc | t | t | t | t\n"
+ . " bbc | abc | f | f | f | f\n"
+ . " bbc | ABC | f | f | f | f\n"
+ . " bbc | äbc | f | f | f | f\n"
+ . " bbc | bbc | f | f | f | f\n"
+ . "(16 rows)\n"
+ . "\n"
+ ],
+ # polymorphism
+ [
+ "SELECT * FROM unnest("
+ . "(SELECT array_agg(b ORDER BY b) FROM collate_test1)"
+ . ") ORDER BY 1;",
+ " unnest \n"
+ . "--------\n"
+ . " abc\n"
+ . " ABC\n"
+ . " äbc\n"
+ . " bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "CREATE FUNCTION dup (anyelement) RETURNS anyelement "
+ . "AS 'select \$1' LANGUAGE sql;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "SELECT a, dup(b) FROM collate_test1 ORDER BY 2;",
+ " a | dup \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # indexes
+ [
+ "CREATE INDEX collate_test1_idx1 ON collate_test1 (b);",
+ "CREATE INDEX\n"
+ ]
+ )
+);
+
+$node->stop;
diff --git a/src/test/default_collation/icu/.gitignore b/src/test/default_collation/icu/.gitignore
new file mode 100644
index 0000000000..871e943d50
--- /dev/null
+++ b/src/test/default_collation/icu/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/default_collation/icu/Makefile b/src/test/default_collation/icu/Makefile
new file mode 100644
index 0000000000..5ee91d8eaf
--- /dev/null
+++ b/src/test/default_collation/icu/Makefile
@@ -0,0 +1,11 @@
+# src/test/default_collation/icu/Makefile
+
+subdir = src/test/default_collation/icu
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+ $(prove_check)
+
+clean distclean maintainer-clean:
+ rm -rf tmp_check
diff --git a/src/test/default_collation/icu/t/001_default_collation.pl b/src/test/default_collation/icu/t/001_default_collation.pl
new file mode 100644
index 0000000000..8b58be3fa5
--- /dev/null
+++ b/src/test/default_collation/icu/t/001_default_collation.pl
@@ -0,0 +1,605 @@
+use strict;
+use warnings;
+
+use Config;
+use PostgresNode;
+use TestLib;
+use Test::More;
+
+my $tempdir = TestLib::tempdir;
+my $datadir = "$tempdir/data";
+
+# check whether ICU can convert C locale to a language tag
+
+my ($in_initdb, $out_initdb, $err_initdb);
+my @command = (qw(initdb -A trust -N -D), $datadir, "--locale=C\@icu");
+print "# Running: " . join(" ", @command) . "\n";
+my $result = IPC::Run::run \@command, \$in_initdb, \$out_initdb, \$err_initdb;
+
+my $c_to_icu_language_tag = (
+ not $err_initdb =~ /ICU error: could not convert locale name "C" to language tag: U_ILLEGAL_ARGUMENT_ERROR/);
+
+# get the number of tests
+
+plan tests => $c_to_icu_language_tag ? 124 : 110;
+
+# test initdb
+
+sub test_initdb
+{
+ my ($test_name, $options, $expected_collprovider, $error_message) = @_;
+ my ($in_initdb, $out_initdb, $err_initdb);
+
+ mkdir $datadir;
+
+ my @command = (qw(initdb -A trust -N -D), $datadir, split(" ", $options));
+ print "# Running: " . join(" ", @command) . "\n";
+ my $result = IPC::Run::run \@command, \$in_initdb, \$out_initdb,
+ \$err_initdb;
+
+ if ($error_message)
+ {
+ like($err_initdb,
+ qr{$error_message},
+ "initdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_initdb, "", "\"@command\" no stderr");
+ like($out_initdb,
+ qr{The default collation provider is \"$expected_collprovider\"\.},
+ "initdb: $test_name: check output");
+ }
+
+ File::Path::rmtree $datadir;
+}
+
+# --locale
+
+test_initdb(
+ "empty libc locale",
+ "--locale=\@libc",
+ "libc",
+ "");
+
+test_initdb(
+ "C locale without collation provider",
+ "--locale=C",
+ "libc",
+ "");
+
+test_initdb(
+ "POSIX locale without collation provider",
+ "--locale=POSIX",
+ "libc",
+ "");
+
+test_initdb(
+ "C libc locale",
+ "--locale=C\@libc",
+ "libc",
+ "");
+
+test_initdb(
+ "POSIX libc locale",
+ "--locale=POSIX\@libc",
+ "libc",
+ "");
+
+test_initdb(
+ "C icu locale",
+ "--locale=C\@icu",
+ "",
+ ($c_to_icu_language_tag ?
+ "selected encoding \\(SQL_ASCII\\) is not supported for ICU locales" :
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR"));
+
+test_initdb(
+ "POSIX icu locale",
+ "--locale=POSIX\@icu",
+ "",
+ ($c_to_icu_language_tag ?
+ "selected encoding \\(SQL_ASCII\\) is not supported for ICU locales" :
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR"));
+
+test_initdb(
+ "C locale too many modifiers",
+ "--locale=C\@icu\@libc",
+ "",
+ "invalid locale name \"C\@icu\"");
+
+test_initdb(
+ "ICU language tag format locale",
+ "--locale=und-x-icu",
+ "",
+ "invalid locale name \"und-x-icu\"");
+
+# --lc-collate with the same --lc-ctype if needed
+
+test_initdb(
+ "empty libc lc_collate",
+ "--lc-collate=\@libc",
+ "libc",
+ "");
+
+test_initdb(
+ "C lc_collate without collation provider",
+ "--lc-collate=C",
+ "libc",
+ "");
+
+test_initdb(
+ "POSIX lc_collate without collation provider",
+ "--lc-collate=POSIX",
+ "libc",
+ "");
+
+test_initdb(
+ "C libc lc_collate",
+ "--lc-collate=C\@libc",
+ "libc",
+ "");
+
+test_initdb(
+ "POSIX libc lc_collate",
+ "--lc-collate=POSIX\@libc",
+ "libc",
+ "");
+
+test_initdb(
+ "C icu lc_collate",
+ "--lc-collate=C\@icu --lc-ctype=C",
+ "",
+ ($c_to_icu_language_tag ?
+ "selected encoding \\(SQL_ASCII\\) is not supported for ICU locales" :
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR"));
+
+test_initdb(
+ "POSIX icu lc_collate",
+ "--lc-collate=POSIX\@icu --lc-ctype=POSIX",
+ "",
+ ($c_to_icu_language_tag ?
+ "selected encoding \\(SQL_ASCII\\) is not supported for ICU locales" :
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR"));
+
+test_initdb(
+ "C lc_collate too many modifiers",
+ "--lc-collate=C\@icu\@libc",
+ "",
+ "invalid locale name \"C\@icu\"");
+
+test_initdb(
+ "ICU language tag format lc_collate",
+ "--lc-collate=und-x-icu",
+ "",
+ "invalid locale name \"und-x-icu\"");
+
+# --locale & --lc-collate
+
+test_initdb(
+ "lc_collate implicit provider takes precedence",
+ "--locale=\@icu --lc-collate=C",
+ "libc",
+ "");
+
+test_initdb(
+ "lc_collate explicit provider takes precedence",
+ "--locale=C\@libc --lc-collate=C\@icu",
+ "",
+ ($c_to_icu_language_tag ?
+ "selected encoding \\(SQL_ASCII\\) is not supported for ICU locales" :
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR"));
+
+# test createdb and CREATE DATABASE
+
+sub test_createdb
+{
+ my ($test_name, $options, $expected_collprovider, $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ @command = ("createdb",
+ split(" ", $options),
+ "--template=template0",
+ "mydb");
+
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "createdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+
+ @command = (
+ "psql",
+ "-c",
+ "select datcollate from pg_database where datname = 'mydb';");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($expected_collprovider eq "libc")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider\n},
+ "createdb: $test_name: check pg_database.datcollate");
+ }
+ elsif ($expected_collprovider eq "icu")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider([\.\d]+)?\n},
+ "createdb: $test_name: check pg_database.datcollate");
+ }
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+sub test_create_database
+{
+ my ($test_name,
+ $createdb_options,
+ $psql_options,
+ $expected_collprovider,
+ $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ @command = ("psql",
+ split(" ", $psql_options),
+ "-c",
+ "create database mydb "
+ . $createdb_options
+ . " template = template0;");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "CREATE DATABASE: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+ like($out_command, qr{CREATE DATABASE}, "\"@command\" check output");
+
+ @command = (
+ "psql",
+ "-c",
+ "select datcollate from pg_database where datname = 'mydb';");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($expected_collprovider eq "libc")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider\n},
+ "CREATE DATABASE: $test_name: check pg_database.datcollate");
+ }
+ elsif ($expected_collprovider eq "icu")
+ {
+ like($out_command,
+ qr{\@$expected_collprovider([\.\d]+)?\n},
+ "CREATE DATABASE: $test_name: check pg_database.datcollate");
+ }
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+my $node = get_new_node('main');
+$node->init;
+$node->start;
+local $ENV{PGPORT} = $node->port;
+
+@command = ("createuser --createdb --no-superuser non_superuser");
+print "# Running: " . join(" ", @command) . "\n";
+system(@command);
+
+# test createdb
+
+# --locale
+
+test_createdb(
+ "empty libc locale",
+ "--locale=\@libc",
+ "libc",
+ "");
+
+test_createdb(
+ "C locale without collation provider",
+ "--locale=C",
+ "libc",
+ "");
+
+test_createdb(
+ "POSIX locale without collation provider",
+ "--locale=POSIX",
+ "libc",
+ "");
+
+test_createdb(
+ "C libc locale",
+ "--locale=C\@libc",
+ "libc",
+ "");
+
+test_createdb(
+ "POSIX libc locale",
+ "--locale=POSIX\@libc",
+ "libc",
+ "");
+
+if ($c_to_icu_language_tag)
+{
+ test_createdb(
+ "C icu locale with SQL_ASCII encoding and superuser",
+ "--locale=C\@icu --encoding=SQL_ASCII",
+ "icu",
+ "");
+}
+else
+{
+ test_createdb(
+ "C icu locale with SQL_ASCII encoding and superuser",
+ "--locale=C\@icu --encoding=SQL_ASCII",
+ "",
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR");
+}
+
+test_createdb(
+ "C icu locale with SQL_ASCII encoding and non-superuser",
+ "--locale=C\@icu --encoding=SQL_ASCII --username=non_superuser",
+ "",
+ "encoding \"SQL_ASCII\" is not supported for ICU locales");
+
+if ($c_to_icu_language_tag)
+{
+ test_createdb(
+ "POSIX icu locale with SQL_ASCII encoding and superuser",
+ "--locale=POSIX\@icu --encoding=SQL_ASCII",
+ "icu",
+ "");
+}
+else
+{
+ test_createdb(
+ "POSIX icu locale with SQL_ASCII encoding and superuser",
+ "--locale=POSIX\@icu --encoding=SQL_ASCII",
+ "",
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR");
+}
+
+test_createdb(
+ "POSIX icu locale with SQL_ASCII encoding and non-superuser",
+ "--locale=POSIX\@icu --encoding=SQL_ASCII --username=non_superuser",
+ "",
+ "encoding \"SQL_ASCII\" is not supported for ICU locales");
+
+test_createdb(
+ "C locale too many modifiers",
+ "--locale=C\@icu\@libc",
+ "",
+ "invalid locale name: \"C\@icu\"");
+
+test_createdb(
+ "ICU language tag format locale",
+ "--locale=und-x-icu",
+ "",
+ "invalid locale name: \"und-x-icu\"");
+
+# --lc-collate with the same --lc-ctype if needed
+
+test_createdb(
+ "empty libc lc_collate",
+ "--lc-collate=\@libc",
+ "libc",
+ "");
+
+test_createdb(
+ "C lc_collate without collation provider",
+ "--lc-collate=C --lc-ctype=C",
+ "libc",
+ "");
+
+test_createdb(
+ "POSIX lc_collate without collation provider",
+ "--lc-collate=POSIX --lc-ctype=POSIX",
+ "libc",
+ "");
+
+test_createdb(
+ "C libc lc_collate",
+ "--lc-collate=C\@libc --lc-ctype=C",
+ "libc",
+ "");
+
+test_createdb(
+ "POSIX libc lc_collate",
+ "--lc-collate=POSIX\@libc --lc-ctype=POSIX",
+ "libc",
+ "");
+
+if ($c_to_icu_language_tag)
+{
+ test_createdb(
+ "C icu lc_collate with SQL_ASCII encoding and superuser",
+ "--lc-collate=C\@icu --lc-ctype=C --encoding=SQL_ASCII",
+ "icu",
+ "");
+}
+else
+{
+ test_createdb(
+ "C icu lc_collate with SQL_ASCII encoding and superuser",
+ "--lc-collate=C\@icu --lc-ctype=C --encoding=SQL_ASCII",
+ "",
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR");
+}
+
+test_createdb(
+ "C icu lc_collate with SQL_ASCII encoding and non-superuser",
+ "--lc-collate=C\@icu --lc-ctype=C --encoding=SQL_ASCII "
+ . "--username=non_superuser",
+ "",
+ "encoding \"SQL_ASCII\" is not supported for ICU locales");
+
+if ($c_to_icu_language_tag)
+{
+ test_createdb(
+ "POSIX icu lc_collate with SQL_ASCII encoding and superuser",
+ "--lc-collate=POSIX\@icu --lc-ctype=POSIX --encoding=SQL_ASCII",
+ "icu",
+ "");
+
+}
+else
+{
+ test_createdb(
+ "POSIX icu lc_collate with SQL_ASCII encoding and superuser",
+ "--lc-collate=POSIX\@icu --lc-ctype=POSIX --encoding=SQL_ASCII",
+ "",
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR");
+}
+
+test_createdb(
+ "POSIX icu lc_collate with SQL_ASCII encoding and non-superuser",
+ "--lc-collate=POSIX\@icu --lc-ctype=POSIX --encoding=SQL_ASCII "
+ . "--username=non_superuser",
+ "",
+ "encoding \"SQL_ASCII\" is not supported for ICU locales");
+
+test_createdb(
+ "C lc_collate too many modifiers",
+ "--lc-collate=C\@icu\@libc",
+ "",
+ "invalid locale name: \"C\@icu\"");
+
+test_createdb(
+ "ICU language tag format lc_collate",
+ "--lc-collate=und-x-icu",
+ "",
+ "invalid locale name: \"und-x-icu\"");
+
+# test CREATE DATABASE
+
+# LC_COLLATE with the same LC_CTYPE if needed
+
+test_create_database(
+ "empty libc lc_collate",
+ "LC_COLLATE = '\@libc'",
+ "",
+ "libc",
+ "");
+
+test_create_database(
+ "C lc_collate without collation provider",
+ "LC_COLLATE = 'C' LC_CTYPE = 'C'",
+ "",
+ "libc",
+ "");
+
+test_create_database(
+ "POSIX lc_collate without collation provider",
+ "LC_COLLATE = 'POSIX' LC_CTYPE = 'POSIX'",
+ "",
+ "libc",
+ "");
+
+test_create_database(
+ "C libc lc_collate",
+ "LC_COLLATE = 'C\@libc' LC_CTYPE = 'C'",
+ "",
+ "libc",
+ "");
+
+test_create_database(
+ "POSIX libc lc_collate",
+ "LC_COLLATE = 'POSIX\@libc' LC_CTYPE = 'POSIX'",
+ "",
+ "libc",
+ "");
+
+if ($c_to_icu_language_tag)
+{
+ test_create_database(
+ "C icu lc_collate with SQL_ASCII encoding and superuser",
+ "LC_COLLATE = 'C\@icu' LC_CTYPE = 'C' ENCODING = 'SQL_ASCII'",
+ "",
+ "icu",
+ "");
+}
+else
+{
+ test_create_database(
+ "C icu lc_collate with SQL_ASCII encoding and superuser",
+ "LC_COLLATE = 'C\@icu' LC_CTYPE = 'C' ENCODING = 'SQL_ASCII'",
+ "",
+ "",
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR");
+}
+
+test_create_database(
+ "C icu lc_collate with SQL_ASCII encoding and non-superuser",
+ "LC_COLLATE = 'C\@icu' LC_CTYPE = 'C' ENCODING = 'SQL_ASCII'",
+ "--username=non_superuser",
+ "",
+ "encoding \"SQL_ASCII\" is not supported for ICU locales");
+
+if ($c_to_icu_language_tag)
+{
+ test_create_database(
+ "POSIX icu lc_collate with SQL_ASCII encoding and superuser",
+ "LC_COLLATE = 'POSIX\@icu' LC_CTYPE = 'POSIX' ENCODING = 'SQL_ASCII'",
+ "",
+ "icu",
+ "");
+}
+else
+{
+ test_create_database(
+ "POSIX icu lc_collate with SQL_ASCII encoding and superuser",
+ "LC_COLLATE = 'POSIX\@icu' LC_CTYPE = 'POSIX' ENCODING = 'SQL_ASCII'",
+ "",
+ "",
+ "ICU error: could not convert locale name \"C\" to language tag: U_ILLEGAL_ARGUMENT_ERROR");
+}
+
+test_create_database(
+ "POSIX icu lc_collate with SQL_ASCII encoding and non-superuser",
+ "LC_COLLATE = 'POSIX\@icu' LC_CTYPE = 'POSIX' ENCODING = 'SQL_ASCII'",
+ "--username=non_superuser",
+ "",
+ "encoding \"SQL_ASCII\" is not supported for ICU locales");
+
+test_create_database(
+ "C lc_collate too many modifiers",
+ "LC_COLLATE = 'C\@icu\@libc'",
+ "",
+ "",
+ "invalid locale name: \"C\@icu\"");
+
+test_create_database(
+ "ICU language tag format lc_collate",
+ "LC_COLLATE = 'und-x-icu'",
+ "",
+ "",
+ "invalid locale name: \"und-x-icu\"");
+
+$node->stop;
diff --git a/src/test/default_collation/libc.utf8/.gitignore b/src/test/default_collation/libc.utf8/.gitignore
new file mode 100644
index 0000000000..871e943d50
--- /dev/null
+++ b/src/test/default_collation/libc.utf8/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/default_collation/libc.utf8/Makefile b/src/test/default_collation/libc.utf8/Makefile
new file mode 100644
index 0000000000..e5b9d20958
--- /dev/null
+++ b/src/test/default_collation/libc.utf8/Makefile
@@ -0,0 +1,11 @@
+# src/test/default_collation/libc.utf8/Makefile
+
+subdir = src/test/default_collation/libc.utf8
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+ $(prove_check)
+
+clean distclean maintainer-clean:
+ rm -rf tmp_check
diff --git a/src/test/default_collation/libc.utf8/t/001_default_collation.pl b/src/test/default_collation/libc.utf8/t/001_default_collation.pl
new file mode 100644
index 0000000000..e4b3552922
--- /dev/null
+++ b/src/test/default_collation/libc.utf8/t/001_default_collation.pl
@@ -0,0 +1,703 @@
+use strict;
+use warnings;
+
+use Config;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 168;
+
+my $tempdir = TestLib::tempdir;
+my $datadir = "$tempdir/data";
+
+# test initdb
+
+sub test_initdb
+{
+ my ($test_name, $options, $error_message) = @_;
+ my ($in_initdb, $out_initdb, $err_initdb);
+
+ mkdir $datadir;
+
+ my @command = (qw(initdb -A trust -N -D), $datadir, split(" ", $options));
+ print "# Running: " . join(" ", @command) . "\n";
+ my $result = IPC::Run::run \@command, \$in_initdb, \$out_initdb,
+ \$err_initdb;
+
+ if ($error_message)
+ {
+ like($err_initdb,
+ qr{$error_message},
+ "initdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_initdb, "", "\"@command\" no stderr");
+ like($out_initdb,
+ qr{The default collation provider is \"libc\"\.},
+ "initdb: $test_name: check output");
+ }
+
+ File::Path::rmtree $datadir;
+}
+
+sub psql
+{
+ my ($command, $db) = @_;
+ my ($result, $in, $out, $err);
+ my @psql = ('psql', '-X', '-c', $command);
+ if (defined($db))
+ {
+ push(@psql, $db);
+ }
+ print "# Running: " . join(" ", @psql) . "\n";
+ $result = IPC::Run::run \@psql, \$in, \$out, \$err;
+ ($result, $out, $err);
+}
+
+# --locale
+
+test_initdb(
+ "be_BY\@latin libc locale",
+ "--locale=be_BY\@latin\@libc",
+ "");
+
+test_initdb(
+ "be_BY\@latin libc locale invalid modifier order",
+ "--locale=be_BY\@libc\@latin",
+ "invalid locale name \"be_BY\@libc\@latin\"");
+
+# --lc-collate
+
+test_initdb(
+ "be_BY\@latin libc lc_collate",
+ "--lc-collate=be_BY\@latin\@libc",
+ "");
+
+test_initdb(
+ "be_BY\@latin libc lc_collate invalid modifier order",
+ "--lc-collate=be_BY\@libc\@latin",
+ "invalid locale name \"be_BY\@libc\@latin\" \\(provider \"libc\"\\)");
+
+# test createdb, CREATE DATABASE and default collation behaviour
+
+sub test_createdb
+{
+ my ($test_name, $options, $from_template0, $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ if ($from_template0)
+ {
+ $options = $options . " --template=template0";
+ }
+
+ @command = ("createdb", split(" ", $options), "mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "createdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+
+ ($result, $out_command, $err_command) = psql(
+ "select datcollate from pg_database where datname = 'mydb';");
+
+ like($out_command,
+ qr{\@libc\n},
+ "createdb: $test_name: check pg_database.datcollate");
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+sub test_create_database
+{
+ my ($test_name, $options, $from_template0, $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ ($result, $out_command, $err_command) = psql(
+ "create database mydb "
+ . $options
+ . ($from_template0 ? " TEMPLATE = template0;" : ";"));
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "CREATE DATABASE: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "CREATE DATABASE: $test_name: exit code 0");
+ is($err_command, "", "CREATE DATABASE: $test_name: no stderr");
+ like($out_command, qr{CREATE DATABASE}, "CREATE DATABASE: $test_name: check output");
+
+ ($result, $out_command, $err_command) = psql(
+ "select datcollate from pg_database where datname = 'mydb';");
+
+ like($out_command,
+ qr{\@libc\n},
+ "CREATE DATABASE: $test_name: check pg_database.datcollate");
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+sub test_default_collation
+{
+ my ($createdb_options, $collation, @commands) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ @command = ("createdb", split(" ", $createdb_options), "mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+
+ ($result, $out_command, $err_command) = psql(
+ "select datcollate from pg_database where datname = 'mydb';");
+
+ like($out_command, qr{\@libc\n}, "\"@command\" check output");
+
+ for (my $row = 0; $row <= $#commands; $row++)
+ {
+ my ($command_text, $expected) = @{$commands[$row]};
+ ($result, $out_command, $err_command) = psql($command_text, "mydb");
+
+ ok($result, "\"$command_text\" exit code 0");
+ is($err_command, "", "\"$command_text\" no stderr");
+ if ($out_command)
+ {
+ is(
+ $out_command,
+ $expected,
+ "default collation "
+ . $collation
+ . ": \""
+ . $command_text
+ . "\" check output");
+ }
+ }
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+}
+
+my $node = get_new_node('main');
+$node->init;
+$node->start;
+local $ENV{PGPORT} = $node->port;
+
+# test createdb
+
+# --locale
+
+test_createdb(
+ "be_BY\@latin libc locale",
+ "--locale=be_BY\@latin\@libc",
+ 1,
+ "");
+
+test_createdb(
+ "be_BY\@latin libc locale invalid modifier order",
+ "--locale=be_BY\@libc\@latin",
+ 1,
+ "invalid locale name: \"be_BY\@libc\@latin\"");
+
+# --lc-collate
+
+test_createdb(
+ "be_BY\@latin libc lc_collate",
+ "--lc-collate=be_BY\@latin\@libc",
+ 1,
+ "");
+
+test_createdb(
+ "be_BY\@latin libc lc_collate invalid modifier order",
+ "--lc-collate=be_BY\@libc\@latin",
+ 1,
+ "invalid locale name: \"be_BY\@libc\@latin\" \\(provider \"libc\"\\)");
+
+# test CREATE DATABASE
+
+# LC_COLLATE
+
+test_create_database(
+ "be_BY\@latin libc lc_collate",
+ "LC_COLLATE = 'be_BY\@latin\@libc'",
+ 1,
+ "");
+
+test_create_database(
+ "be_BY\@latin libc lc_collate invalid modifier order",
+ "LC_COLLATE = 'be_BY\@libc\@latin'",
+ 1,
+ "invalid locale name: \"be_BY\@libc\@latin\" \\(provider \"libc\"\\)");
+
+# test default collation behaviour
+# use commands and outputs from the regression test collate.linux.utf8
+
+test_default_collation(
+ "--lc-collate=en_US.utf8\@libc --template=template0",
+ "en_US.utf8\@libc",
+ (
+ [
+ "CREATE TABLE collate_test1 (a int, b text NOT NULL);",
+ "CREATE TABLE\n"
+ ],
+ [
+ "INSERT INTO collate_test1 VALUES "
+ . "(1, 'abc'), (2, 'äbc'), (3, 'bbc'), (4, 'ABC');",
+ "INSERT 0 4\n"],
+ [
+ "SELECT * FROM collate_test1 WHERE b >= 'bbc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 3 | bbc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, b FROM collate_test1 ORDER BY b;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # star expansion
+ [
+ "SELECT * FROM collate_test1 ORDER BY b;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # upper/lower
+ ["CREATE TABLE collate_test10 (a int, x text);", "CREATE TABLE\n", ""],
+ [
+ "INSERT INTO collate_test10 VALUES (1, 'hij'), (2, 'HIJ');",
+ "INSERT 0 2\n"
+ ],
+ [
+ "SELECT a, lower(x), upper(x), initcap(x) FROM collate_test10;",
+ " a | lower | upper | initcap \n"
+ . "---+-------+-------+---------\n"
+ . " 1 | hij | HIJ | Hij\n"
+ . " 2 | hij | HIJ | Hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ # LIKE/ILIKE
+ [
+ "SELECT * FROM collate_test1 WHERE b LIKE 'abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b LIKE 'abc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b LIKE '%bc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(3 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ILIKE 'abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ILIKE 'abc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ILIKE '%bc%';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 4 | ABC\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'Türkiye' ILIKE '%KI%' AS \"true\";",
+ " true \n"
+ . "------\n"
+ . " t\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'bıt' ILIKE 'BIT' AS \"false\";",
+ " false \n"
+ . "-------\n"
+ . " f\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ # regular expressions
+ [
+ "SELECT * FROM collate_test1 WHERE b ~ '^abc\$';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~ '^abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~ 'bc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(3 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~* '^abc\$';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~* '^abc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT * FROM collate_test1 WHERE b ~* 'bc';",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 4 | ABC\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ ["CREATE TABLE collate_test6 (a int, b text);", "CREATE TABLE\n", ""],
+ [
+ "INSERT INTO collate_test6 VALUES "
+ . "(1, 'abc'), (2, 'ABC'), (3, '123'), (4, 'ab1'), "
+ . "(5, 'a1!'), (6, 'a c'), (7, '!.;'), (8, ' '), "
+ . "(9, 'äbç'), (10, 'ÄBÇ');",
+ "INSERT 0 10\n"
+ ],
+ [
+ "SELECT b, "
+ . "b ~ '^[[:alpha:]]+\$' AS is_alpha, "
+ . "b ~ '^[[:upper:]]+\$' AS is_upper, "
+ . "b ~ '^[[:lower:]]+\$' AS is_lower, "
+ . "b ~ '^[[:digit:]]+\$' AS is_digit, "
+ . "b ~ '^[[:alnum:]]+\$' AS is_alnum, "
+ . "b ~ '^[[:graph:]]+\$' AS is_graph, "
+ . "b ~ '^[[:print:]]+\$' AS is_print, "
+ . "b ~ '^[[:punct:]]+\$' AS is_punct, "
+ . "b ~ '^[[:space:]]+\$' AS is_space "
+ . "FROM collate_test6;",
+ " b | is_alpha | is_upper | is_lower | is_digit | is_alnum | is_graph | is_print | is_punct | is_space \n"
+ . "-----+----------+----------+----------+----------+----------+----------+----------+----------+----------\n"
+ . " abc | t | f | t | f | t | t | t | f | f\n"
+ . " ABC | t | t | f | f | t | t | t | f | f\n"
+ . " 123 | f | f | f | t | t | t | t | f | f\n"
+ . " ab1 | f | f | f | f | t | t | t | f | f\n"
+ . " a1! | f | f | f | f | f | t | t | f | f\n"
+ . " a c | f | f | f | f | f | f | t | f | f\n"
+ . " !.; | f | f | f | f | f | t | t | t | f\n"
+ . " | f | f | f | f | f | f | t | f | t\n"
+ . " äbç | t | f | t | f | t | t | t | f | f\n"
+ . " ÄBÇ | t | t | f | f | t | t | t | f | f\n"
+ . "(10 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'Türkiye' ~* 'KI' AS \"true\";",
+ " true \n"
+ . "------\n"
+ . " t\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT 'bıt' ~* 'BIT' AS \"false\";",
+ " false \n"
+ . "-------\n"
+ . " f\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, lower(coalesce(x, 'foo')) FROM collate_test10;",
+ " a | lower \n"
+ . "---+-------\n"
+ . " 1 | hij\n"
+ . " 2 | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, b, greatest(b, 'CCC') FROM collate_test1 ORDER BY 3;",
+ " a | b | greatest \n"
+ . "---+-----+----------\n"
+ . " 1 | abc | CCC\n"
+ . " 2 | äbc | CCC\n"
+ . " 3 | bbc | CCC\n"
+ . " 4 | ABC | CCC\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, x, lower(greatest(x, 'foo')) FROM collate_test10;",
+ " a | x | lower \n"
+ . "---+-----+-------\n"
+ . " 1 | hij | hij\n"
+ . " 2 | HIJ | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, nullif(b, 'abc') FROM collate_test1 ORDER BY 2;",
+ " a | nullif \n"
+ . "---+--------\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 1 | \n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, lower(nullif(x, 'foo')) FROM collate_test10;",
+ " a | lower \n"
+ . "---+-------\n"
+ . " 1 | hij\n"
+ . " 2 | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, CASE b WHEN 'abc' THEN 'abcd' ELSE b END "
+ . "FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+------\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 1 | abcd\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ ["CREATE DOMAIN testdomain AS text;", "CREATE DOMAIN\n", ""],
+ [
+ "SELECT a, b::testdomain FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, lower(x::testdomain) FROM collate_test10;",
+ " a | lower \n"
+ . "---+-------\n"
+ . " 1 | hij\n"
+ . " 2 | hij\n"
+ . "(2 rows)\n"
+ . "\n"
+ ],
+ [
+ "SELECT min(b), max(b) FROM collate_test1;",
+ " min | max \n"
+ . "-----+-----\n"
+ . " abc | bbc\n"
+ . "(1 row)\n"
+ . "\n",
+ ""
+ ],
+ [
+ "SELECT array_agg(b ORDER BY b) FROM collate_test1;",
+ " array_agg \n"
+ . "-------------------\n"
+ . " {abc,ABC,äbc,bbc}\n"
+ . "(1 row)\n"
+ . "\n"
+ ],
+ [
+ "SELECT a, b FROM collate_test1 "
+ . "UNION ALL "
+ . "SELECT a, b FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . " 3 | bbc\n"
+ . "(8 rows)\n"
+ . "\n"
+ ],
+ # casting
+ [
+ "SELECT a, CAST(b AS varchar) FROM collate_test1 ORDER BY 2;",
+ " a | b \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # propagation of collation in SQL functions (inlined and non-inlined
+ # cases) and plpgsql functions too
+ [
+ "CREATE FUNCTION mylt (text, text) RETURNS boolean LANGUAGE sql "
+ . "AS \$\$ select \$1 < \$2 \$\$;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "CREATE FUNCTION mylt_noninline (text, text) "
+ . "RETURNS boolean LANGUAGE sql "
+ . "AS \$\$ select \$1 < \$2 limit 1 \$\$;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "CREATE FUNCTION mylt_plpgsql (text, text) "
+ . "RETURNS boolean LANGUAGE plpgsql "
+ . "AS \$\$ begin return \$1 < \$2; end \$\$;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "SELECT a.b AS a, b.b AS b, a.b < b.b AS lt, "
+ . "mylt(a.b, b.b), mylt_noninline(a.b, b.b), mylt_plpgsql(a.b, b.b) "
+ . "FROM collate_test1 a, collate_test1 b "
+ . "ORDER BY a.b, b.b;",
+ " a | b | lt | mylt | mylt_noninline | mylt_plpgsql \n"
+ . "-----+-----+----+------+----------------+--------------\n"
+ . " abc | abc | f | f | f | f\n"
+ . " abc | ABC | t | t | t | t\n"
+ . " abc | äbc | t | t | t | t\n"
+ . " abc | bbc | t | t | t | t\n"
+ . " ABC | abc | f | f | f | f\n"
+ . " ABC | ABC | f | f | f | f\n"
+ . " ABC | äbc | t | t | t | t\n"
+ . " ABC | bbc | t | t | t | t\n"
+ . " äbc | abc | f | f | f | f\n"
+ . " äbc | ABC | f | f | f | f\n"
+ . " äbc | äbc | f | f | f | f\n"
+ . " äbc | bbc | t | t | t | t\n"
+ . " bbc | abc | f | f | f | f\n"
+ . " bbc | ABC | f | f | f | f\n"
+ . " bbc | äbc | f | f | f | f\n"
+ . " bbc | bbc | f | f | f | f\n"
+ . "(16 rows)\n"
+ . "\n"
+ ],
+ # polymorphism
+ [
+ "SELECT * FROM unnest("
+ . "(SELECT array_agg(b ORDER BY b) FROM collate_test1)"
+ . ") ORDER BY 1;",
+ " unnest \n"
+ . "--------\n"
+ . " abc\n"
+ . " ABC\n"
+ . " äbc\n"
+ . " bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ [
+ "CREATE FUNCTION dup (anyelement) RETURNS anyelement "
+ . "AS 'select \$1' LANGUAGE sql;",
+ "CREATE FUNCTION\n"
+ ],
+ [
+ "SELECT a, dup(b) FROM collate_test1 ORDER BY 2;",
+ " a | dup \n"
+ . "---+-----\n"
+ . " 1 | abc\n"
+ . " 4 | ABC\n"
+ . " 2 | äbc\n"
+ . " 3 | bbc\n"
+ . "(4 rows)\n"
+ . "\n"
+ ],
+ # indexes
+ [
+ "CREATE INDEX collate_test1_idx1 ON collate_test1 (b);",
+ "CREATE INDEX\n"
+ ]
+ )
+);
+
+$node->stop;
diff --git a/src/test/default_collation/libc/.gitignore b/src/test/default_collation/libc/.gitignore
new file mode 100644
index 0000000000..871e943d50
--- /dev/null
+++ b/src/test/default_collation/libc/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/default_collation/libc/Makefile b/src/test/default_collation/libc/Makefile
new file mode 100644
index 0000000000..98ab736d7a
--- /dev/null
+++ b/src/test/default_collation/libc/Makefile
@@ -0,0 +1,11 @@
+# src/test/default_collation/libc/Makefile
+
+subdir = src/test/default_collation/libc
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+ $(prove_check)
+
+clean distclean maintainer-clean:
+ rm -rf tmp_check
diff --git a/src/test/default_collation/libc/t/001_default_collation.pl b/src/test/default_collation/libc/t/001_default_collation.pl
new file mode 100644
index 0000000000..bc8a6ad02c
--- /dev/null
+++ b/src/test/default_collation/libc/t/001_default_collation.pl
@@ -0,0 +1,355 @@
+use strict;
+use warnings;
+
+use Config;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 90;
+
+my $tempdir = TestLib::tempdir;
+my $datadir = "$tempdir/data";
+
+# test initdb
+
+sub test_initdb
+{
+ my ($test_name, $options, $error_message) = @_;
+ my ($in_initdb, $out_initdb, $err_initdb);
+
+ mkdir $datadir;
+
+ my @command = (qw(initdb -A trust -N -D), $datadir, split(" ", $options));
+ print "# Running: " . join(" ", @command) . "\n";
+ my $result = IPC::Run::run \@command, \$in_initdb, \$out_initdb,
+ \$err_initdb;
+
+ if ($error_message)
+ {
+ like($err_initdb,
+ qr{$error_message},
+ "initdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_initdb, "", "\"@command\" no stderr");
+ like($out_initdb,
+ qr{The default collation provider is \"libc\"\.},
+ "initdb: $test_name: check output");
+ }
+
+ File::Path::rmtree $datadir;
+}
+
+# empty locales
+
+test_initdb(
+ "empty locales",
+ "",
+ "");
+
+# --locale
+
+test_initdb(
+ "empty libc locale",
+ "--locale=\@libc",
+ "");
+
+test_initdb(
+ "C locale without collation provider",
+ "--locale=C",
+ "");
+
+test_initdb(
+ "POSIX locale without collation provider",
+ "--locale=POSIX",
+ "");
+
+test_initdb(
+ "C libc locale",
+ "--locale=C\@libc",
+ "");
+
+test_initdb(
+ "C icu locale",
+ "--locale=C\@icu",
+ "ICU is not supported in this build");
+
+test_initdb(
+ "C locale too many modifiers",
+ "--locale=C\@icu\@libc",
+ "invalid locale name \"C\@icu\"");
+
+# --lc-collate
+
+test_initdb(
+ "empty libc lc_collate",
+ "--lc-collate=\@libc",
+ "");
+
+test_initdb(
+ "C lc_collate without collation provider",
+ "--lc-collate=C",
+ "");
+
+test_initdb(
+ "POSIX lc_collate without collation provider",
+ "--lc-collate=POSIX",
+ "");
+
+test_initdb(
+ "C libc lc_collate",
+ "--lc-collate=C\@libc",
+ "");
+
+test_initdb(
+ "C icu lc_collate",
+ "--lc-collate=C\@icu",
+ "ICU is not supported in this build");
+
+test_initdb(
+ "C lc_collate too many modifiers",
+ "--lc-collate=C\@icu\@libc",
+ "invalid locale name \"C\@icu\" \\(provider \"libc\"\\)");
+
+# --locale & --lc-collate
+
+test_initdb(
+ "lc_collate implicit provider takes precedence",
+ "--locale=\@icu --lc-collate=C",
+ "");
+
+test_initdb(
+ "lc_collate explicit provider takes precedence",
+ "--locale=\@icu --lc-collate=\@libc",
+ "");
+
+# test createdb and CREATE DATABASE
+
+sub test_createdb
+{
+ my ($test_name, $options, $from_template0, $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ if ($from_template0)
+ {
+ $options = $options . " --template=template0";
+ }
+
+ @command = ("createdb", split(" ", $options), "mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "createdb: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+
+ @command = (
+ "psql",
+ "-c",
+ "select datcollate from pg_database where datname = 'mydb';");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ like($out_command,
+ qr{\@libc\n},
+ "createdb: $test_name: check pg_database.datcollate");
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+sub test_create_database
+{
+ my ($test_name, $options, $from_template0, $error_message) = @_;
+ my (@command, $result, $in_command, $out_command, $err_command);
+
+ @command = ("psql",
+ "-c",
+ "create database mydb "
+ . $options
+ . ($from_template0 ? " template = template0" : "")
+ . ";");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ if ($error_message)
+ {
+ like($err_command,
+ qr{$error_message},
+ "CREATE DATABASE: $test_name: check error message");
+ }
+ else
+ {
+ ok($result, "\"@command\" exit code 0");
+ is($err_command, "", "\"@command\" no stderr");
+ like($out_command, qr{CREATE DATABASE}, "\"@command\" check output");
+
+ @command = (
+ "psql",
+ "-c",
+ "select datcollate from pg_database where datname = 'mydb';");
+ print "# Running: " . join(" ", @command) . "\n";
+ $result = IPC::Run::run \@command, \$in_command, \$out_command,
+ \$err_command;
+
+ like($out_command,
+ qr{\@libc\n},
+ "CREATE DATABASE: $test_name: check pg_database.datcollate");
+
+ @command = ("dropdb mydb");
+ print "# Running: " . join(" ", @command) . "\n";
+ system(@command);
+ }
+}
+
+my $node = get_new_node('main');
+$node->init;
+$node->start;
+local $ENV{PGPORT} = $node->port;
+
+# test createdb
+
+# empty locales
+
+test_createdb(
+ "empty locales",
+ "",
+ 0,
+ "");
+
+# --locale
+
+test_createdb(
+ "empty libc locale",
+ "--locale=\@libc",
+ 0,
+ "");
+
+test_createdb(
+ "C locale without collation provider",
+ "--locale=C",
+ 1,
+ "");
+
+test_createdb(
+ "POSIX locale without collation provider",
+ "--locale=POSIX",
+ 1,
+ "");
+
+test_createdb(
+ "C libc locale",
+ "--locale=C\@libc",
+ 1,
+ "");
+
+test_createdb(
+ "C icu locale",
+ "--locale=C\@icu",
+ 1,
+ "ICU is not supported in this build");
+
+test_createdb(
+ "C locale too many modifiers",
+ "--locale=C\@icu\@libc",
+ 1,
+ "invalid locale name: \"C\@icu\"");
+
+# --lc-collate
+
+test_createdb(
+ "empty libc lc_collate",
+ "--lc-collate=\@libc",
+ 0,
+ "");
+
+test_createdb(
+ "C lc_collate without collation provider",
+ "--lc-collate=C",
+ 1,
+ "");
+test_createdb(
+ "POSIX lc_collate without collation provider",
+ "--lc-collate=POSIX",
+ 1,
+ "");
+
+test_createdb(
+ "C libc lc_collate",
+ "--lc-collate=C\@libc",
+ 1,
+ "");
+
+test_createdb(
+ "C icu lc_collate",
+ "--lc-collate=C\@icu",
+ 1,
+ "ICU is not supported in this build");
+
+test_createdb(
+ "C lc_collate too many modifiers",
+ "--lc-collate=C\@icu\@libc",
+ 1,
+ "invalid locale name: \"C\@icu\" \\(provider \"libc\"\\)");
+
+# test CREATE DATABASE
+
+# empty locales
+
+test_create_database(
+ "empty locales",
+ "",
+ 0,
+ "");
+
+# LC_COLLATE
+
+test_create_database(
+ "empty libc lc_collate",
+ "LC_COLLATE = '\@libc'",
+ 0,
+ "");
+
+test_create_database(
+ "C lc_collate without collation provider",
+ "LC_COLLATE = 'C'",
+ 1,
+ "");
+test_create_database(
+ "POSIX lc_collate without collation provider",
+ "LC_COLLATE = 'POSIX'",
+ 1,
+ "");
+
+test_create_database(
+ "C libc lc_collate",
+ "LC_COLLATE = 'C\@libc'",
+ 1,
+ "");
+
+test_create_database(
+ "C icu lc_collate",
+ "LC_COLLATE = 'C\@icu'",
+ 1,
+ "ICU is not supported in this build");
+
+test_create_database(
+ "C lc_collate too many modifiers",
+ "LC_COLLATE = 'C\@icu\@libc'",
+ 1,
+ "invalid locale name: \"C\@icu\" \\(provider \"libc\"\\)");
+
+$node->stop;
diff --git a/src/test/isolation/Makefile b/src/test/isolation/Makefile
index c3c8280ea2..d4dd6d57b5 100644
--- a/src/test/isolation/Makefile
+++ b/src/test/isolation/Makefile
@@ -10,6 +10,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
override CPPFLAGS := -I$(srcdir) -I$(libpq_srcdir) -I$(srcdir)/../regress $(CPPFLAGS)
+LDFLAGS_INTERNAL += $(ICU_LIBS)
OBJS = specparse.o isolationtester.o $(WIN32RES)
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index b66193d1be..f78665c0ef 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -979,11 +979,14 @@ SELECT relname, pg_get_indexdef(oid) FROM pg_class WHERE relname LIKE 'collate_t
-- schema manipulation commands
CREATE ROLE regress_test_role;
CREATE SCHEMA test_schema;
+-- remove provider modifier and collation version
+CREATE FUNCTION get_lc_collate (text) RETURNS text LANGUAGE sql
+ AS $$ select substring($1 from '(.*)@[^@]+$') $$;
-- We need to do this this way to cope with varying names for encodings:
do $$
BEGIN
EXECUTE 'CREATE COLLATION test0 (provider = icu, locale = ' ||
- quote_literal(current_setting('lc_collate')) || ');';
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) || ');';
END
$$;
CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
@@ -991,7 +994,7 @@ ERROR: collation "test0" already exists
do $$
BEGIN
EXECUTE 'CREATE COLLATION test1 (provider = icu, lc_collate = ' ||
- quote_literal(current_setting('lc_collate')) ||
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) ||
', lc_ctype = ' ||
quote_literal(current_setting('lc_ctype')) || ');';
END
diff --git a/src/test/regress/expected/collate.linux.utf8.out b/src/test/regress/expected/collate.linux.utf8.out
index d33f04a3b5..82648af029 100644
--- a/src/test/regress/expected/collate.linux.utf8.out
+++ b/src/test/regress/expected/collate.linux.utf8.out
@@ -988,11 +988,14 @@ SELECT relname, pg_get_indexdef(oid) FROM pg_class WHERE relname LIKE 'collate_t
-- schema manipulation commands
CREATE ROLE regress_test_role;
CREATE SCHEMA test_schema;
+-- remove provider modifier and collation version
+CREATE FUNCTION get_lc_collate (text) RETURNS text LANGUAGE sql
+ AS $$ select substring($1 from '(.*)@[^@]+$') $$;
-- We need to do this this way to cope with varying names for encodings:
do $$
BEGIN
EXECUTE 'CREATE COLLATION test0 (locale = ' ||
- quote_literal(current_setting('lc_collate')) || ');';
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) || ');';
END
$$;
CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
@@ -1004,7 +1007,7 @@ NOTICE: collation "test0" for encoding "UTF8" already exists, skipping
do $$
BEGIN
EXECUTE 'CREATE COLLATION test1 (lc_collate = ' ||
- quote_literal(current_setting('lc_collate')) ||
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) ||
', lc_ctype = ' ||
quote_literal(current_setting('lc_ctype')) || ');';
END
diff --git a/src/test/regress/sql/collate.icu.utf8.sql b/src/test/regress/sql/collate.icu.utf8.sql
index 68c2d69659..5471bf92ba 100644
--- a/src/test/regress/sql/collate.icu.utf8.sql
+++ b/src/test/regress/sql/collate.icu.utf8.sql
@@ -339,18 +339,22 @@ SELECT relname, pg_get_indexdef(oid) FROM pg_class WHERE relname LIKE 'collate_t
CREATE ROLE regress_test_role;
CREATE SCHEMA test_schema;
+-- remove provider modifier and collation version
+CREATE FUNCTION get_lc_collate (text) RETURNS text LANGUAGE sql
+ AS $$ select substring($1 from '(.*)@[^@]+$') $$;
+
-- We need to do this this way to cope with varying names for encodings:
do $$
BEGIN
EXECUTE 'CREATE COLLATION test0 (provider = icu, locale = ' ||
- quote_literal(current_setting('lc_collate')) || ');';
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) || ');';
END
$$;
CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
do $$
BEGIN
EXECUTE 'CREATE COLLATION test1 (provider = icu, lc_collate = ' ||
- quote_literal(current_setting('lc_collate')) ||
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) ||
', lc_ctype = ' ||
quote_literal(current_setting('lc_ctype')) || ');';
END
diff --git a/src/test/regress/sql/collate.linux.utf8.sql b/src/test/regress/sql/collate.linux.utf8.sql
index e882153244..77fef8d268 100644
--- a/src/test/regress/sql/collate.linux.utf8.sql
+++ b/src/test/regress/sql/collate.linux.utf8.sql
@@ -339,11 +339,15 @@ SELECT relname, pg_get_indexdef(oid) FROM pg_class WHERE relname LIKE 'collate_t
CREATE ROLE regress_test_role;
CREATE SCHEMA test_schema;
+-- remove provider modifier and collation version
+CREATE FUNCTION get_lc_collate (text) RETURNS text LANGUAGE sql
+ AS $$ select substring($1 from '(.*)@[^@]+$') $$;
+
-- We need to do this this way to cope with varying names for encodings:
do $$
BEGIN
EXECUTE 'CREATE COLLATION test0 (locale = ' ||
- quote_literal(current_setting('lc_collate')) || ');';
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) || ');';
END
$$;
CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
@@ -352,7 +356,7 @@ CREATE COLLATION IF NOT EXISTS test0 (locale = 'foo'); -- ok, skipped
do $$
BEGIN
EXECUTE 'CREATE COLLATION test1 (lc_collate = ' ||
- quote_literal(current_setting('lc_collate')) ||
+ quote_literal(get_lc_collate(current_setting('lc_collate'))) ||
', lc_ctype = ' ||
quote_literal(current_setting('lc_ctype')) || ');';
END
diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index 726f2ba167..bbdfffe8c1 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -51,12 +51,19 @@ my @contrib_excludes = (
'snapshot_too_old');
# Set of variables for frontend modules
-my $frontend_defines = { 'initdb' => 'FRONTEND' };
+my $frontend_defines = {
+ 'initdb' => 'FRONTEND',
+ 'psql' => 'FRONTEND',
+ 'pg_dump' => 'FRONTEND',
+ 'pg_dumpall' => 'FRONTEND',
+ 'pg_restore' => 'FRONTEND',
+ };
my @frontend_uselibpq = ('pg_ctl', 'pg_upgrade', 'pgbench', 'psql', 'initdb');
my @frontend_uselibpgport = (
'pg_archivecleanup', 'pg_test_fsync',
'pg_test_timing', 'pg_upgrade',
'pg_waldump', 'pgbench');
+my @iculibs = ('icuin.lib', 'icuuc.lib');
my @frontend_uselibpgcommon = (
'pg_archivecleanup', 'pg_test_fsync',
'pg_test_timing', 'pg_upgrade',
@@ -65,8 +72,10 @@ my $frontend_extralibs = {
'initdb' => ['ws2_32.lib'],
'pg_restore' => ['ws2_32.lib'],
'pgbench' => ['ws2_32.lib'],
+ 'mchar' => [@iculibs],
'psql' => ['ws2_32.lib']
};
+my @frontend_iculibs = ('initdb', 'pg_upgrade');
my $frontend_extraincludes = {
'initdb' => ['src/timezone'],
'psql' => ['src/backend']
@@ -120,7 +129,7 @@ sub mkvcbuild
our @pgcommonallfiles = qw(
base64.c config_info.c controldata_utils.c d2s.c exec.c f2s.c file_perm.c ip.c
- keywords.c kwlookup.c link-canary.c md5.c
+ keywords.c kwlookup.c link-canary.c md5.c pg_collation_fn_common.c
pg_lzcompress.c pgfnames.c psprintf.c relpath.c rmtree.c
saslprep.c scram-common.c string.c unicode_norm.c username.c
wait_error.c);
@@ -155,6 +164,7 @@ sub mkvcbuild
$libpgfeutils->AddDefine('FRONTEND');
$libpgfeutils->AddIncludeDir('src/interfaces/libpq');
$libpgfeutils->AddFiles('src/fe_utils', @pgfeutilsfiles);
+ $libpgfeutils->AddFile('src/common/pg_collation_fn_common.c');
$postgres = $solution->AddProject('postgres', 'exe', '', 'src/backend');
$postgres->AddIncludeDir('src/backend');
@@ -238,6 +248,7 @@ sub mkvcbuild
'src/interfaces/libpq');
$libpq->AddDefine('FRONTEND');
$libpq->AddDefine('UNSAFE_STAT_OK');
+ $libpq->AddDefine('LIBPQ_MAKE');
$libpq->AddIncludeDir('src/port');
$libpq->AddLibrary('secur32.lib');
$libpq->AddLibrary('ws2_32.lib');
@@ -246,6 +257,7 @@ sub mkvcbuild
$libpq->ReplaceFile('src/interfaces/libpq/libpqrc.c',
'src/interfaces/libpq/libpq.rc');
$libpq->AddReference($libpgcommon, $libpgport);
+ $libpq->AddFile('src/common/pg_collation_fn_common.c');
# The OBJS scraper doesn't know about ifdefs, so remove appropriate files
# if building without OpenSSL.
@@ -425,6 +437,12 @@ sub mkvcbuild
{
push @contrib_excludes, 'uuid-ossp';
}
+ else
+ {
+ foreach my $fe (@frontend_iculibs) {
+ push @{$frontend_extralibs->{$fe}}, @iculibs;
+ }
+ }
# AddProject() does not recognize the constructs used to populate OBJS in
# the pgcrypto Makefile, so it will discover no files.
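A note on the locale syntax exercised by the tests above. Judging only from the expected error messages (for example, "C@icu@libc" is rejected as locale name "C@icu" with provider "libc"), the text after the last "@" appears to be read as the collation provider, and the SQL helper get_lc_collate() strips that last component again. The Python sketch below mirrors that behaviour so the test cases are easier to follow. It is an illustration only, not code from the patch: the names split_locale and strip_provider are made up here, and the real parsing in the patch's C code may differ (in particular, how a collation version is encoded is not visible in this part of the diff).

```python
import re


def split_locale(name):
    """Split 'locale@provider' at the LAST '@'.

    Returns (locale, provider); provider is None when there is no '@'.
    Assumption: this mirrors what the TAP tests expect, not the patch's C code.
    """
    locale, sep, provider = name.rpartition("@")
    if not sep:
        # No '@' at all: the whole string is the locale name.
        return name, None
    return locale, provider


def strip_provider(value):
    """Python analogue of the SQL helper get_lc_collate():
    substring($1 from '(.*)@[^@]+$').
    Returns None (SQL NULL) when the pattern does not match.
    """
    m = re.search(r"(.*)@[^@]+$", value)
    return m.group(1) if m else None


if __name__ == "__main__":
    for name in ["C", "C@libc", "@libc", "C@icu@libc"]:
        print(repr(name), "->", split_locale(name))
```

With this reading, split_locale("C@icu@libc") yields ("C@icu", "libc"), which lines up with the 'invalid locale name "C@icu" (provider "libc")' message the tests look for, and "@libc" yields an empty locale name with provider "libc".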
On 2019-03-21 18:46, Marius Timmer wrote:
As I mentioned three weeks ago, the patch from October 2018 did not apply
to master. In the meantime I have rebased it. Additionally, I fixed some
Makefiles because a few ICU libraries were missing. Now this patch applies
and compiles successfully on my machine. After installing it, running "make
installcheck-world" results in some failures (for example "select"). I
will take a closer look at those failures and review the whole patch in
the next few days. I just wanted to spare you from having to do the same
rebasing work. The new patch is attached to this mail.
As I said previously in this thread, this patch needs some fundamental
design work. I don't think it's worth doing a code review on the patch
as it is.
--
Peter Eisentraut http://www.2ndQuadrant.com/