ICU integration

Started by Peter Eisentraut, over 9 years ago · 83 messages
#1 Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
1 attachment

Here is a patch I've been working on to allow the use of ICU for sorting
and other locale things.

This is mostly complementary to the existing FreeBSD ICU patch, most
recently discussed in [0]. While that patch removes the POSIX locale
use and replaces it with ICU, my interest was on allowing the use of
both. I think that is necessary for upgrading, compatibility, and maybe
because someone likes it.

What I have done is extend collation objects with a collprovider column
that tells whether the collation is using POSIX (appropriate name?) or
ICU facilities. The pg_locale_t type is changed to a struct that
contains the provider-specific locale handles. Users of locale
information are changed to look into that struct for the appropriate
handle to use.
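
To make that concrete, here is a rough sketch of the struct; the field
names follow the accessors the attached patch uses (mylocale->provider,
->lt, ->ucol, ->ucnv, ->locale), but the exact definition in the patch
may differ:

#include <locale.h>				/* locale_t */
#ifdef USE_ICU
#include <unicode/ucol.h>		/* UCollator */
#include <unicode/ucnv.h>		/* UConverter */
#endif

/* Sketch only; not the literal definition from the patch. */
struct pg_locale_struct
{
	char		provider;		/* COLLPROVIDER_POSIX or COLLPROVIDER_ICU */
#ifdef HAVE_LOCALE_T
	locale_t	lt;				/* libc locale handle (POSIX provider) */
#endif
#ifdef USE_ICU
	UCollator  *ucol;			/* collator, used for sorting */
	UConverter *ucnv;			/* converts between server encoding and UChar */
	char	   *locale;			/* ICU locale ID, e.g. "fr_FR" */
#endif
};

typedef struct pg_locale_struct *pg_locale_t;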

In initdb, I initialize the default collation set as before from the
`locale -a` output, but also add all available ICU locales with a "%icu"
appended (so "fr_FR%icu"). I suppose one could create a configuration
option perhaps in initdb to change the default so that, say, "fr_FR"
uses ICU and "fr_FR%posix" uses the old stuff.
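
For illustration only (this is not the initdb code from the patch), the
enumeration side of that could look roughly like this, using ICU's stock
collator enumeration API:

#include <stdio.h>
#include <unicode/ucol.h>		/* ucol_countAvailable, ucol_getAvailable */

/*
 * Sketch: print one collation name per available ICU locale, with
 * "%icu" appended (e.g. "fr_FR%icu"), the way initdb would name them.
 */
int
main(void)
{
	int32_t		i;

	for (i = 0; i < ucol_countAvailable(); i++)
		printf("%s%%icu\n", ucol_getAvailable(i));
	return 0;
}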

That all works well enough for named collations and for sorting. The
thread about the FreeBSD ICU patch discusses some details of how to best
use the ICU APIs to do various aspects of the sorting, so I didn't focus
on that too much. I took the existing collate.linux.utf8.sql test and
ported it to the ICU setup, and it passes except for the case noted below.

I'm not sure how well it will work to replace all the bits of LIKE and
regular expressions with ICU API calls. One problem is that ICU likes
to do case folding as a whole string, not by character. I need to do
more research about that. Another problem, which was also previously
discussed, is that ICU does case folding in a locale-agnostic manner, so
it does not consider things such as the Turkish special cases. This is
per the Unicode standard, modulo weasel wording, but it breaks existing
tests at least.
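
To illustrate the whole-string point, here is a minimal standalone
sketch (not from the patch) using ICU's case folding API, which operates
on an entire UChar buffer and takes only an options bitmask, no locale:

#include <stdio.h>
#include <unicode/uchar.h>		/* U_FOLD_CASE_DEFAULT */
#include <unicode/ustring.h>	/* u_strFromUTF8, u_strFoldCase, u_strToUTF8 */

/*
 * Sketch only: fold a UTF-8 string as a whole.  Fixed-size buffers and
 * minimal error handling, purely for illustration.
 */
int
main(void)
{
	const char *input = "Stra\xc3\x9f" "e";	/* "Straße" in UTF-8 */
	UChar		ubuf[64];
	UChar		fbuf[64];
	char		out[64];
	int32_t		ulen;
	int32_t		flen;
	UErrorCode	status = U_ZERO_ERROR;

	u_strFromUTF8(ubuf, 64, &ulen, input, -1, &status);
	flen = u_strFoldCase(fbuf, 64, ubuf, ulen, U_FOLD_CASE_DEFAULT, &status);
	u_strToUTF8(out, sizeof(out), NULL, fbuf, flen, &status);
	if (U_FAILURE(status))
		return 1;
	printf("%s\n", out);		/* "strasse": one input char can become two */
	return 0;
}

(Compile with the same flags the configure test uses, e.g.
pkg-config --cflags --libs icu-uc icu-i18n.)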

So right now the entries in collcollate and collctype need to be valid
for ICU *and* POSIX for everything to work.

Also note that ICU locales are encoding-independent and don't support a
separate collcollate and collctype, so the existing catalog structure is
not optimal.

Where it gets really interesting is what to do with the database
locales. They just set the global process locale. So in order to port
that to ICU we'd need to check every implicit use of the process locale
and tweak it. We could add a datcollprovider column or something. But
we also rely on the datctype setting to validate the encoding of the
database. Maybe we wouldn't need that anymore, but it sounds risky.

We could have a datcollation column that references, by OID, a collation
defined inside the database. With a background worker, we can log into
the database as it is being created and make adjustments, including
defining or adjusting collation definitions. This would open up
interesting new possibilities.

What is a way to go forward here? What's a minimal useful feature that
is future-proof? Just allow named collations referencing ICU for now?
Is throwing out POSIX locales even for the process locale reasonable?

Oh, that case folding code in formatting.c needs some refactoring.
There are so many ifdefs there, and it's repeated almost identically
three times; it's crazy to work in.

[0]: /messages/by-id/789A2F56-0E42-409D-A840-6AF5110D6085@pingpong.net

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

icu-integration.patch (text/x-patch)
diff --git a/aclocal.m4 b/aclocal.m4
index 6f930b6..5ca902b 100644
--- a/aclocal.m4
+++ b/aclocal.m4
@@ -7,6 +7,7 @@ m4_include([config/docbook.m4])
 m4_include([config/general.m4])
 m4_include([config/libtool.m4])
 m4_include([config/perl.m4])
+m4_include([config/pkg.m4])
 m4_include([config/programs.m4])
 m4_include([config/python.m4])
 m4_include([config/tcl.m4])
diff --git a/config/pkg.m4 b/config/pkg.m4
new file mode 100644
index 0000000..82bea96
--- /dev/null
+++ b/config/pkg.m4
@@ -0,0 +1,275 @@
+dnl pkg.m4 - Macros to locate and utilise pkg-config.   -*- Autoconf -*-
+dnl serial 11 (pkg-config-0.29.1)
+dnl
+dnl Copyright © 2004 Scott James Remnant <scott@netsplit.com>.
+dnl Copyright © 2012-2015 Dan Nicholson <dbn.lists@gmail.com>
+dnl
+dnl This program is free software; you can redistribute it and/or modify
+dnl it under the terms of the GNU General Public License as published by
+dnl the Free Software Foundation; either version 2 of the License, or
+dnl (at your option) any later version.
+dnl
+dnl This program is distributed in the hope that it will be useful, but
+dnl WITHOUT ANY WARRANTY; without even the implied warranty of
+dnl MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+dnl General Public License for more details.
+dnl
+dnl You should have received a copy of the GNU General Public License
+dnl along with this program; if not, write to the Free Software
+dnl Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA
+dnl 02111-1307, USA.
+dnl
+dnl As a special exception to the GNU General Public License, if you
+dnl distribute this file as part of a program that contains a
+dnl configuration script generated by Autoconf, you may include it under
+dnl the same distribution terms that you use for the rest of that
+dnl program.
+
+dnl PKG_PREREQ(MIN-VERSION)
+dnl -----------------------
+dnl Since: 0.29
+dnl
+dnl Verify that the version of the pkg-config macros are at least
+dnl MIN-VERSION. Unlike PKG_PROG_PKG_CONFIG, which checks the user's
+dnl installed version of pkg-config, this checks the developer's version
+dnl of pkg.m4 when generating configure.
+dnl
+dnl To ensure that this macro is defined, also add:
+dnl m4_ifndef([PKG_PREREQ],
+dnl     [m4_fatal([must install pkg-config 0.29 or later before running autoconf/autogen])])
+dnl
+dnl See the "Since" comment for each macro you use to see what version
+dnl of the macros you require.
+m4_defun([PKG_PREREQ],
+[m4_define([PKG_MACROS_VERSION], [0.29.1])
+m4_if(m4_version_compare(PKG_MACROS_VERSION, [$1]), -1,
+    [m4_fatal([pkg.m4 version $1 or higher is required but ]PKG_MACROS_VERSION[ found])])
+])dnl PKG_PREREQ
+
+dnl PKG_PROG_PKG_CONFIG([MIN-VERSION])
+dnl ----------------------------------
+dnl Since: 0.16
+dnl
+dnl Search for the pkg-config tool and set the PKG_CONFIG variable to
+dnl first found in the path. Checks that the version of pkg-config found
+dnl is at least MIN-VERSION. If MIN-VERSION is not specified, 0.9.0 is
+dnl used since that's the first version where most current features of
+dnl pkg-config existed.
+AC_DEFUN([PKG_PROG_PKG_CONFIG],
+[m4_pattern_forbid([^_?PKG_[A-Z_]+$])
+m4_pattern_allow([^PKG_CONFIG(_(PATH|LIBDIR|SYSROOT_DIR|ALLOW_SYSTEM_(CFLAGS|LIBS)))?$])
+m4_pattern_allow([^PKG_CONFIG_(DISABLE_UNINSTALLED|TOP_BUILD_DIR|DEBUG_SPEW)$])
+AC_ARG_VAR([PKG_CONFIG], [path to pkg-config utility])
+AC_ARG_VAR([PKG_CONFIG_PATH], [directories to add to pkg-config's search path])
+AC_ARG_VAR([PKG_CONFIG_LIBDIR], [path overriding pkg-config's built-in search path])
+
+if test "x$ac_cv_env_PKG_CONFIG_set" != "xset"; then
+	AC_PATH_TOOL([PKG_CONFIG], [pkg-config])
+fi
+if test -n "$PKG_CONFIG"; then
+	_pkg_min_version=m4_default([$1], [0.9.0])
+	AC_MSG_CHECKING([pkg-config is at least version $_pkg_min_version])
+	if $PKG_CONFIG --atleast-pkgconfig-version $_pkg_min_version; then
+		AC_MSG_RESULT([yes])
+	else
+		AC_MSG_RESULT([no])
+		PKG_CONFIG=""
+	fi
+fi[]dnl
+])dnl PKG_PROG_PKG_CONFIG
+
+dnl PKG_CHECK_EXISTS(MODULES, [ACTION-IF-FOUND], [ACTION-IF-NOT-FOUND])
+dnl -------------------------------------------------------------------
+dnl Since: 0.18
+dnl
+dnl Check to see whether a particular set of modules exists. Similar to
+dnl PKG_CHECK_MODULES(), but does not set variables or print errors.
+dnl
+dnl Please remember that m4 expands AC_REQUIRE([PKG_PROG_PKG_CONFIG])
+dnl only at the first occurence in configure.ac, so if the first place
+dnl it's called might be skipped (such as if it is within an "if", you
+dnl have to call PKG_CHECK_EXISTS manually
+AC_DEFUN([PKG_CHECK_EXISTS],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])dnl
+if test -n "$PKG_CONFIG" && \
+    AC_RUN_LOG([$PKG_CONFIG --exists --print-errors "$1"]); then
+  m4_default([$2], [:])
+m4_ifvaln([$3], [else
+  $3])dnl
+fi])
+
+dnl _PKG_CONFIG([VARIABLE], [COMMAND], [MODULES])
+dnl ---------------------------------------------
+dnl Internal wrapper calling pkg-config via PKG_CONFIG and setting
+dnl pkg_failed based on the result.
+m4_define([_PKG_CONFIG],
+[if test -n "$$1"; then
+    pkg_cv_[]$1="$$1"
+ elif test -n "$PKG_CONFIG"; then
+    PKG_CHECK_EXISTS([$3],
+                     [pkg_cv_[]$1=`$PKG_CONFIG --[]$2 "$3" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes ],
+		     [pkg_failed=yes])
+ else
+    pkg_failed=untried
+fi[]dnl
+])dnl _PKG_CONFIG
+
+dnl _PKG_SHORT_ERRORS_SUPPORTED
+dnl ---------------------------
+dnl Internal check to see if pkg-config supports short errors.
+AC_DEFUN([_PKG_SHORT_ERRORS_SUPPORTED],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])
+if $PKG_CONFIG --atleast-pkgconfig-version 0.20; then
+        _pkg_short_errors_supported=yes
+else
+        _pkg_short_errors_supported=no
+fi[]dnl
+])dnl _PKG_SHORT_ERRORS_SUPPORTED
+
+
+dnl PKG_CHECK_MODULES(VARIABLE-PREFIX, MODULES, [ACTION-IF-FOUND],
+dnl   [ACTION-IF-NOT-FOUND])
+dnl --------------------------------------------------------------
+dnl Since: 0.4.0
+dnl
+dnl Note that if there is a possibility the first call to
+dnl PKG_CHECK_MODULES might not happen, you should be sure to include an
+dnl explicit call to PKG_PROG_PKG_CONFIG in your configure.ac
+AC_DEFUN([PKG_CHECK_MODULES],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])dnl
+AC_ARG_VAR([$1][_CFLAGS], [C compiler flags for $1, overriding pkg-config])dnl
+AC_ARG_VAR([$1][_LIBS], [linker flags for $1, overriding pkg-config])dnl
+
+pkg_failed=no
+AC_MSG_CHECKING([for $1])
+
+_PKG_CONFIG([$1][_CFLAGS], [cflags], [$2])
+_PKG_CONFIG([$1][_LIBS], [libs], [$2])
+
+m4_define([_PKG_TEXT], [Alternatively, you may set the environment variables $1[]_CFLAGS
+and $1[]_LIBS to avoid the need to call pkg-config.
+See the pkg-config man page for more details.])
+
+if test $pkg_failed = yes; then
+   	AC_MSG_RESULT([no])
+        _PKG_SHORT_ERRORS_SUPPORTED
+        if test $_pkg_short_errors_supported = yes; then
+	        $1[]_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "$2" 2>&1`
+        else 
+	        $1[]_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "$2" 2>&1`
+        fi
+	# Put the nasty error message in config.log where it belongs
+	echo "$$1[]_PKG_ERRORS" >&AS_MESSAGE_LOG_FD
+
+	m4_default([$4], [AC_MSG_ERROR(
+[Package requirements ($2) were not met:
+
+$$1_PKG_ERRORS
+
+Consider adjusting the PKG_CONFIG_PATH environment variable if you
+installed software in a non-standard prefix.
+
+_PKG_TEXT])[]dnl
+        ])
+elif test $pkg_failed = untried; then
+     	AC_MSG_RESULT([no])
+	m4_default([$4], [AC_MSG_FAILURE(
+[The pkg-config script could not be found or is too old.  Make sure it
+is in your PATH or set the PKG_CONFIG environment variable to the full
+path to pkg-config.
+
+_PKG_TEXT
+
+To get pkg-config, see <http://pkg-config.freedesktop.org/>.])[]dnl
+        ])
+else
+	$1[]_CFLAGS=$pkg_cv_[]$1[]_CFLAGS
+	$1[]_LIBS=$pkg_cv_[]$1[]_LIBS
+        AC_MSG_RESULT([yes])
+	$3
+fi[]dnl
+])dnl PKG_CHECK_MODULES
+
+
+dnl PKG_CHECK_MODULES_STATIC(VARIABLE-PREFIX, MODULES, [ACTION-IF-FOUND],
+dnl   [ACTION-IF-NOT-FOUND])
+dnl ---------------------------------------------------------------------
+dnl Since: 0.29
+dnl
+dnl Checks for existence of MODULES and gathers its build flags with
+dnl static libraries enabled. Sets VARIABLE-PREFIX_CFLAGS from --cflags
+dnl and VARIABLE-PREFIX_LIBS from --libs.
+dnl
+dnl Note that if there is a possibility the first call to
+dnl PKG_CHECK_MODULES_STATIC might not happen, you should be sure to
+dnl include an explicit call to PKG_PROG_PKG_CONFIG in your
+dnl configure.ac.
+AC_DEFUN([PKG_CHECK_MODULES_STATIC],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])dnl
+_save_PKG_CONFIG=$PKG_CONFIG
+PKG_CONFIG="$PKG_CONFIG --static"
+PKG_CHECK_MODULES($@)
+PKG_CONFIG=$_save_PKG_CONFIG[]dnl
+])dnl PKG_CHECK_MODULES_STATIC
+
+
+dnl PKG_INSTALLDIR([DIRECTORY])
+dnl -------------------------
+dnl Since: 0.27
+dnl
+dnl Substitutes the variable pkgconfigdir as the location where a module
+dnl should install pkg-config .pc files. By default the directory is
+dnl $libdir/pkgconfig, but the default can be changed by passing
+dnl DIRECTORY. The user can override through the --with-pkgconfigdir
+dnl parameter.
+AC_DEFUN([PKG_INSTALLDIR],
+[m4_pushdef([pkg_default], [m4_default([$1], ['${libdir}/pkgconfig'])])
+m4_pushdef([pkg_description],
+    [pkg-config installation directory @<:@]pkg_default[@:>@])
+AC_ARG_WITH([pkgconfigdir],
+    [AS_HELP_STRING([--with-pkgconfigdir], pkg_description)],,
+    [with_pkgconfigdir=]pkg_default)
+AC_SUBST([pkgconfigdir], [$with_pkgconfigdir])
+m4_popdef([pkg_default])
+m4_popdef([pkg_description])
+])dnl PKG_INSTALLDIR
+
+
+dnl PKG_NOARCH_INSTALLDIR([DIRECTORY])
+dnl --------------------------------
+dnl Since: 0.27
+dnl
+dnl Substitutes the variable noarch_pkgconfigdir as the location where a
+dnl module should install arch-independent pkg-config .pc files. By
+dnl default the directory is $datadir/pkgconfig, but the default can be
+dnl changed by passing DIRECTORY. The user can override through the
+dnl --with-noarch-pkgconfigdir parameter.
+AC_DEFUN([PKG_NOARCH_INSTALLDIR],
+[m4_pushdef([pkg_default], [m4_default([$1], ['${datadir}/pkgconfig'])])
+m4_pushdef([pkg_description],
+    [pkg-config arch-independent installation directory @<:@]pkg_default[@:>@])
+AC_ARG_WITH([noarch-pkgconfigdir],
+    [AS_HELP_STRING([--with-noarch-pkgconfigdir], pkg_description)],,
+    [with_noarch_pkgconfigdir=]pkg_default)
+AC_SUBST([noarch_pkgconfigdir], [$with_noarch_pkgconfigdir])
+m4_popdef([pkg_default])
+m4_popdef([pkg_description])
+])dnl PKG_NOARCH_INSTALLDIR
+
+
+dnl PKG_CHECK_VAR(VARIABLE, MODULE, CONFIG-VARIABLE,
+dnl [ACTION-IF-FOUND], [ACTION-IF-NOT-FOUND])
+dnl -------------------------------------------
+dnl Since: 0.28
+dnl
+dnl Retrieves the value of the pkg-config variable for the given module.
+AC_DEFUN([PKG_CHECK_VAR],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])dnl
+AC_ARG_VAR([$1], [value of $3 for $2, overriding pkg-config])dnl
+
+_PKG_CONFIG([$1], [variable="][$3]["], [$2])
+AS_VAR_COPY([$1], [pkg_cv_][$1])
+
+AS_VAR_IF([$1], [""], [$5], [$4])dnl
+])dnl PKG_CHECK_VAR
diff --git a/configure b/configure
index 45c8eef..9690dc6 100755
--- a/configure
+++ b/configure
@@ -716,6 +716,12 @@ krb_srvtab
 with_python
 with_perl
 with_tcl
+ICU_LIBS
+ICU_CFLAGS
+PKG_CONFIG_LIBDIR
+PKG_CONFIG_PATH
+PKG_CONFIG
+with_icu
 enable_thread_safety
 INCLUDES
 autodepend
@@ -820,6 +826,7 @@ with_CC
 enable_depend
 enable_cassert
 enable_thread_safety
+with_icu
 with_tcl
 with_tclconfig
 with_perl
@@ -855,6 +862,11 @@ LDFLAGS
 LIBS
 CPPFLAGS
 CPP
+PKG_CONFIG
+PKG_CONFIG_PATH
+PKG_CONFIG_LIBDIR
+ICU_CFLAGS
+ICU_LIBS
 LDFLAGS_EX
 LDFLAGS_SL
 DOCBOOKSTYLE'
@@ -1509,6 +1521,7 @@ Optional Packages:
   --with-wal-segsize=SEGSIZE
                           set WAL segment size in MB [16]
   --with-CC=CMD           set compiler (deprecated)
+  --with-icu              build with ICU support
   --with-tcl              build Tcl modules (PL/Tcl)
   --with-tclconfig=DIR    tclConfig.sh is in DIR
   --with-perl             build Perl modules (PL/Perl)
@@ -1544,6 +1557,13 @@ Some influential environment variables:
   CPPFLAGS    (Objective) C/C++ preprocessor flags, e.g. -I<include dir> if
               you have headers in a nonstandard directory <include dir>
   CPP         C preprocessor
+  PKG_CONFIG  path to pkg-config utility
+  PKG_CONFIG_PATH
+              directories to add to pkg-config's search path
+  PKG_CONFIG_LIBDIR
+              path overriding pkg-config's built-in search path
+  ICU_CFLAGS  C compiler flags for ICU, overriding pkg-config
+  ICU_LIBS    linker flags for ICU, overriding pkg-config
   LDFLAGS_EX  extra linker flags for linking executables only
   LDFLAGS_SL  extra linker flags for linking shared libraries only
   DOCBOOKSTYLE
@@ -5340,6 +5360,255 @@ $as_echo "$enable_thread_safety" >&6; }
 
 
 #
+# ICU
+#
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with ICU support" >&5
+$as_echo_n "checking whether to build with ICU support... " >&6; }
+
+
+
+# Check whether --with-icu was given.
+if test "${with_icu+set}" = set; then :
+  withval=$with_icu;
+  case $withval in
+    yes)
+
+$as_echo "#define USE_ICU 1" >>confdefs.h
+
+      ;;
+    no)
+      :
+      ;;
+    *)
+      as_fn_error $? "no argument expected for --with-icu option" "$LINENO" 5
+      ;;
+  esac
+
+else
+  with_icu=no
+
+fi
+
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $with_icu" >&5
+$as_echo "$with_icu" >&6; }
+
+
+if test "$with_icu" = yes; then
+
+
+
+
+
+
+
+if test "x$ac_cv_env_PKG_CONFIG_set" != "xset"; then
+	if test -n "$ac_tool_prefix"; then
+  # Extract the first word of "${ac_tool_prefix}pkg-config", so it can be a program name with args.
+set dummy ${ac_tool_prefix}pkg-config; ac_word=$2
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
+$as_echo_n "checking for $ac_word... " >&6; }
+if ${ac_cv_path_PKG_CONFIG+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  case $PKG_CONFIG in
+  [\\/]* | ?:[\\/]*)
+  ac_cv_path_PKG_CONFIG="$PKG_CONFIG" # Let the user override the test with a path.
+  ;;
+  *)
+  as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
+for as_dir in $PATH
+do
+  IFS=$as_save_IFS
+  test -z "$as_dir" && as_dir=.
+    for ac_exec_ext in '' $ac_executable_extensions; do
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
+    ac_cv_path_PKG_CONFIG="$as_dir/$ac_word$ac_exec_ext"
+    $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
+    break 2
+  fi
+done
+  done
+IFS=$as_save_IFS
+
+  ;;
+esac
+fi
+PKG_CONFIG=$ac_cv_path_PKG_CONFIG
+if test -n "$PKG_CONFIG"; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $PKG_CONFIG" >&5
+$as_echo "$PKG_CONFIG" >&6; }
+else
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+fi
+
+
+fi
+if test -z "$ac_cv_path_PKG_CONFIG"; then
+  ac_pt_PKG_CONFIG=$PKG_CONFIG
+  # Extract the first word of "pkg-config", so it can be a program name with args.
+set dummy pkg-config; ac_word=$2
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
+$as_echo_n "checking for $ac_word... " >&6; }
+if ${ac_cv_path_ac_pt_PKG_CONFIG+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  case $ac_pt_PKG_CONFIG in
+  [\\/]* | ?:[\\/]*)
+  ac_cv_path_ac_pt_PKG_CONFIG="$ac_pt_PKG_CONFIG" # Let the user override the test with a path.
+  ;;
+  *)
+  as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
+for as_dir in $PATH
+do
+  IFS=$as_save_IFS
+  test -z "$as_dir" && as_dir=.
+    for ac_exec_ext in '' $ac_executable_extensions; do
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
+    ac_cv_path_ac_pt_PKG_CONFIG="$as_dir/$ac_word$ac_exec_ext"
+    $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
+    break 2
+  fi
+done
+  done
+IFS=$as_save_IFS
+
+  ;;
+esac
+fi
+ac_pt_PKG_CONFIG=$ac_cv_path_ac_pt_PKG_CONFIG
+if test -n "$ac_pt_PKG_CONFIG"; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_pt_PKG_CONFIG" >&5
+$as_echo "$ac_pt_PKG_CONFIG" >&6; }
+else
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+fi
+
+  if test "x$ac_pt_PKG_CONFIG" = x; then
+    PKG_CONFIG=""
+  else
+    case $cross_compiling:$ac_tool_warned in
+yes:)
+{ $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: using cross tools not prefixed with host triplet" >&5
+$as_echo "$as_me: WARNING: using cross tools not prefixed with host triplet" >&2;}
+ac_tool_warned=yes ;;
+esac
+    PKG_CONFIG=$ac_pt_PKG_CONFIG
+  fi
+else
+  PKG_CONFIG="$ac_cv_path_PKG_CONFIG"
+fi
+
+fi
+if test -n "$PKG_CONFIG"; then
+	_pkg_min_version=0.9.0
+	{ $as_echo "$as_me:${as_lineno-$LINENO}: checking pkg-config is at least version $_pkg_min_version" >&5
+$as_echo_n "checking pkg-config is at least version $_pkg_min_version... " >&6; }
+	if $PKG_CONFIG --atleast-pkgconfig-version $_pkg_min_version; then
+		{ $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5
+$as_echo "yes" >&6; }
+	else
+		{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+		PKG_CONFIG=""
+	fi
+fi
+
+pkg_failed=no
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for ICU" >&5
+$as_echo_n "checking for ICU... " >&6; }
+
+if test -n "$ICU_CFLAGS"; then
+    pkg_cv_ICU_CFLAGS="$ICU_CFLAGS"
+ elif test -n "$PKG_CONFIG"; then
+    if test -n "$PKG_CONFIG" && \
+    { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"icu-uc icu-i18n\""; } >&5
+  ($PKG_CONFIG --exists --print-errors "icu-uc icu-i18n") 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; then
+  pkg_cv_ICU_CFLAGS=`$PKG_CONFIG --cflags "icu-uc icu-i18n" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes
+else
+  pkg_failed=yes
+fi
+ else
+    pkg_failed=untried
+fi
+if test -n "$ICU_LIBS"; then
+    pkg_cv_ICU_LIBS="$ICU_LIBS"
+ elif test -n "$PKG_CONFIG"; then
+    if test -n "$PKG_CONFIG" && \
+    { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"icu-uc icu-i18n\""; } >&5
+  ($PKG_CONFIG --exists --print-errors "icu-uc icu-i18n") 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; then
+  pkg_cv_ICU_LIBS=`$PKG_CONFIG --libs "icu-uc icu-i18n" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes
+else
+  pkg_failed=yes
+fi
+ else
+    pkg_failed=untried
+fi
+
+
+
+if test $pkg_failed = yes; then
+   	{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+
+if $PKG_CONFIG --atleast-pkgconfig-version 0.20; then
+        _pkg_short_errors_supported=yes
+else
+        _pkg_short_errors_supported=no
+fi
+        if test $_pkg_short_errors_supported = yes; then
+	        ICU_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "icu-uc icu-i18n" 2>&1`
+        else
+	        ICU_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "icu-uc icu-i18n" 2>&1`
+        fi
+	# Put the nasty error message in config.log where it belongs
+	echo "$ICU_PKG_ERRORS" >&5
+
+	as_fn_error $? "Package requirements (icu-uc icu-i18n) were not met:
+
+$ICU_PKG_ERRORS
+
+Consider adjusting the PKG_CONFIG_PATH environment variable if you
+installed software in a non-standard prefix.
+
+Alternatively, you may set the environment variables ICU_CFLAGS
+and ICU_LIBS to avoid the need to call pkg-config.
+See the pkg-config man page for more details." "$LINENO" 5
+elif test $pkg_failed = untried; then
+     	{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+	{ { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5
+$as_echo "$as_me: error: in \`$ac_pwd':" >&2;}
+as_fn_error $? "The pkg-config script could not be found or is too old.  Make sure it
+is in your PATH or set the PKG_CONFIG environment variable to the full
+path to pkg-config.
+
+Alternatively, you may set the environment variables ICU_CFLAGS
+and ICU_LIBS to avoid the need to call pkg-config.
+See the pkg-config man page for more details.
+
+To get pkg-config, see <http://pkg-config.freedesktop.org/>.
+See \`config.log' for more details" "$LINENO" 5; }
+else
+	ICU_CFLAGS=$pkg_cv_ICU_CFLAGS
+	ICU_LIBS=$pkg_cv_ICU_LIBS
+        { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5
+$as_echo "yes" >&6; }
+
+fi
+fi
+
+#
 # Optionally build Tcl modules (PL/Tcl)
 #
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with Tcl" >&5
diff --git a/configure.in b/configure.in
index c878b4e..4ef9b4c 100644
--- a/configure.in
+++ b/configure.in
@@ -609,6 +609,19 @@ AC_MSG_RESULT([$enable_thread_safety])
 AC_SUBST(enable_thread_safety)
 
 #
+# ICU
+#
+AC_MSG_CHECKING([whether to build with ICU support])
+PGAC_ARG_BOOL(with, icu, no, [build with ICU support],
+              [AC_DEFINE([USE_ICU], 1, [Define to build with ICU support. (--with-icu)])])
+AC_MSG_RESULT([$with_icu])
+AC_SUBST(with_icu)
+
+if test "$with_icu" = yes; then
+  PKG_CHECK_MODULES(ICU, icu-uc icu-i18n)
+fi
+
+#
 # Optionally build Tcl modules (PL/Tcl)
 #
 AC_MSG_CHECKING([whether to build with Tcl])
diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
index 14a6d57..8e62b11 100644
--- a/doc/src/sgml/installation.sgml
+++ b/doc/src/sgml/installation.sgml
@@ -771,6 +771,15 @@ <title>Configuration</title>
       </varlistentry>
 
       <varlistentry>
+       <term><option>--with-icu</option></term>
+       <listitem>
+        <para>
+         Build with support for the <application>ICU</> library.
+        </para>
+       </listitem>
+      </varlistentry>
+
+      <varlistentry>
        <term><option>--with-openssl</option>
        <indexterm>
         <primary>OpenSSL</primary>
diff --git a/doc/src/sgml/ref/create_collation.sgml b/doc/src/sgml/ref/create_collation.sgml
index d757cdf..f37ffba 100644
--- a/doc/src/sgml/ref/create_collation.sgml
+++ b/doc/src/sgml/ref/create_collation.sgml
@@ -21,7 +21,8 @@
 CREATE COLLATION <replaceable>name</replaceable> (
     [ LOCALE = <replaceable>locale</replaceable>, ]
     [ LC_COLLATE = <replaceable>lc_collate</replaceable>, ]
-    [ LC_CTYPE = <replaceable>lc_ctype</replaceable> ]
+    [ LC_CTYPE = <replaceable>lc_ctype</replaceable>, ]
+    [ PROVIDER = <replaceable>provider</replaceable> ]
 )
 CREATE COLLATION <replaceable>name</replaceable> FROM <replaceable>existing_collation</replaceable>
 </synopsis>
@@ -103,6 +104,19 @@ <title>Parameters</title>
     </varlistentry>
 
     <varlistentry>
+     <term><replaceable>provider</replaceable></term>
+
+     <listitem>
+      <para>
+       Specifies the provider to use for locale services associated with this
+       collation.  Possible values
+       are: <literal>icu</literal>, <literal>posix</literal>.  The available
+       choices depend on the operating system and build options.
+      </para>
+     </listitem>
+    </varlistentry>
+
+    <varlistentry>
      <term><replaceable>existing_collation</replaceable></term>
 
      <listitem>
diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index c211a2d..ad50938 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -179,6 +179,7 @@ pgxsdir = $(pkglibdir)/pgxs
 #
 # Records the choice of the various --enable-xxx and --with-xxx options.
 
+with_icu	= @with_icu@
 with_perl	= @with_perl@
 with_python	= @with_python@
 with_tcl	= @with_tcl@
@@ -207,6 +208,9 @@ python_version		= @python_version@
 
 krb_srvtab = @krb_srvtab@
 
+ICU_CFLAGS		= @ICU_CFLAGS@
+ICU_LIBS		= @ICU_LIBS@
+
 TCLSH			= @TCLSH@
 TCL_LIBS		= @TCL_LIBS@
 TCL_LIB_SPEC		= @TCL_LIB_SPEC@
diff --git a/src/backend/Makefile b/src/backend/Makefile
index 3b08def..943b36d 100644
--- a/src/backend/Makefile
+++ b/src/backend/Makefile
@@ -58,7 +58,7 @@ ifneq ($(PORTNAME), win32)
 ifneq ($(PORTNAME), aix)
 
 postgres: $(OBJS)
-	$(CC) $(CFLAGS) $(LDFLAGS) $(LDFLAGS_EX) $(export_dynamic) $(call expand_subsys,$^) $(LIBS) -o $@
+	$(CC) $(CFLAGS) $(LDFLAGS) $(LDFLAGS_EX) $(export_dynamic) $(call expand_subsys,$^) $(LIBS) $(ICU_LIBS) -o $@
 
 endif
 endif
diff --git a/src/backend/catalog/pg_collation.c b/src/backend/catalog/pg_collation.c
index f37cf37..7d4a8b5 100644
--- a/src/backend/catalog/pg_collation.c
+++ b/src/backend/catalog/pg_collation.c
@@ -40,8 +40,10 @@
 Oid
 CollationCreate(const char *collname, Oid collnamespace,
 				Oid collowner,
+				char collprovider,
 				int32 collencoding,
-				const char *collcollate, const char *collctype)
+				const char *collcollate, const char *collctype,
+				int32 collversion)
 {
 	Relation	rel;
 	TupleDesc	tupDesc;
@@ -75,12 +77,12 @@ CollationCreate(const char *collname, Oid collnamespace,
 		ereport(ERROR,
 				(errcode(ERRCODE_DUPLICATE_OBJECT),
 				 errmsg("collation \"%s\" for encoding \"%s\" already exists",
-						collname, pg_encoding_to_char(collencoding))));
+						collname, pg_encoding_to_char(collencoding)))); // XXX
 
 	/*
 	 * Also forbid matching an any-encoding entry.  This test of course is not
 	 * backed up by the unique index, but it's not a problem since we don't
-	 * support adding any-encoding entries after initdb.
+	 * support adding any-encoding entries after initdb. XXX
 	 */
 	if (SearchSysCacheExists3(COLLNAMEENCNSP,
 							  PointerGetDatum(collname),
@@ -102,11 +104,13 @@ CollationCreate(const char *collname, Oid collnamespace,
 	values[Anum_pg_collation_collname - 1] = NameGetDatum(&name_name);
 	values[Anum_pg_collation_collnamespace - 1] = ObjectIdGetDatum(collnamespace);
 	values[Anum_pg_collation_collowner - 1] = ObjectIdGetDatum(collowner);
+	values[Anum_pg_collation_collprovider - 1] = CharGetDatum(collprovider);
 	values[Anum_pg_collation_collencoding - 1] = Int32GetDatum(collencoding);
 	namestrcpy(&name_collate, collcollate);
 	values[Anum_pg_collation_collcollate - 1] = NameGetDatum(&name_collate);
 	namestrcpy(&name_ctype, collctype);
 	values[Anum_pg_collation_collctype - 1] = NameGetDatum(&name_ctype);
+	values[Anum_pg_collation_collversion - 1] = Int32GetDatum(collversion);
 
 	tup = heap_form_tuple(tupDesc, values, nulls);
 
diff --git a/src/backend/commands/collationcmds.c b/src/backend/commands/collationcmds.c
index e4ebb65..7bd9479 100644
--- a/src/backend/commands/collationcmds.c
+++ b/src/backend/commands/collationcmds.c
@@ -48,8 +48,12 @@ DefineCollation(List *names, List *parameters)
 	DefElem    *localeEl = NULL;
 	DefElem    *lccollateEl = NULL;
 	DefElem    *lcctypeEl = NULL;
+	DefElem    *providerEl = NULL;
 	char	   *collcollate = NULL;
 	char	   *collctype = NULL;
+	char	   *collproviderstr = NULL;
+	char		collprovider;
+	int32		collversion;
 	Oid			newoid;
 	ObjectAddress address;
 
@@ -73,6 +77,8 @@ DefineCollation(List *names, List *parameters)
 			defelp = &lccollateEl;
 		else if (pg_strcasecmp(defel->defname, "lc_ctype") == 0)
 			defelp = &lcctypeEl;
+		else if (pg_strcasecmp(defel->defname, "provider") == 0)
+			defelp = &providerEl;
 		else
 		{
 			ereport(ERROR,
@@ -103,6 +109,7 @@ DefineCollation(List *names, List *parameters)
 
 		collcollate = pstrdup(NameStr(((Form_pg_collation) GETSTRUCT(tp))->collcollate));
 		collctype = pstrdup(NameStr(((Form_pg_collation) GETSTRUCT(tp))->collctype));
+		collprovider = ((Form_pg_collation) GETSTRUCT(tp))->collprovider;
 
 		ReleaseSysCache(tp);
 	}
@@ -119,6 +126,26 @@ DefineCollation(List *names, List *parameters)
 	if (lcctypeEl)
 		collctype = defGetString(lcctypeEl);
 
+	if (providerEl)
+		collproviderstr = defGetString(providerEl);
+
+	if (collproviderstr)
+	{
+		if (pg_strcasecmp(collproviderstr, "posix") == 0)
+			collprovider = COLLPROVIDER_POSIX;
+#ifdef USE_ICU
+		else if (pg_strcasecmp(collproviderstr, "icu") == 0)
+			collprovider = COLLPROVIDER_ICU;
+#endif
+		else
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+					 errmsg("unrecognized collation provider: %s",
+							collproviderstr)));
+	}
+	else if (!fromEl)
+		collprovider = COLLPROVIDER_POSIX;
+
 	if (!collcollate)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
@@ -131,12 +158,37 @@ DefineCollation(List *names, List *parameters)
 
 	check_encoding_locale_matches(GetDatabaseEncoding(), collcollate, collctype);
 
+#ifdef USE_ICU
+	if (collprovider == COLLPROVIDER_ICU)
+	{
+		UCollator  *collator;
+		UErrorCode	status;
+		UVersionInfo versioninfo;
+
+		status = U_ZERO_ERROR;
+		collator = ucol_open(collcollate, &status);
+		if (U_FAILURE(status))
+			ereport(ERROR,
+					(errmsg("could not open collator for locale \"%s\": %s",
+							collcollate, u_errorName(status))));
+
+		ucol_getVersion(collator, versioninfo);
+		ucol_close(collator);
+
+		collversion = ntohl(*((uint32 *) versioninfo));
+	}
+	else
+#endif
+		collversion = 0;
+
 	newoid = CollationCreate(collName,
 							 collNamespace,
 							 GetUserId(),
+							 collprovider,
 							 GetDatabaseEncoding(),
 							 collcollate,
-							 collctype);
+							 collctype,
+							 collversion);
 
 	ObjectAddressSet(address, CollationRelationId, newoid);
 
diff --git a/src/backend/regex/regc_pg_locale.c b/src/backend/regex/regc_pg_locale.c
index 551ae7d..7adf017 100644
--- a/src/backend/regex/regc_pg_locale.c
+++ b/src/backend/regex/regc_pg_locale.c
@@ -303,13 +303,13 @@ pg_wc_isdigit(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswdigit_l((wint_t) c, pg_regex_locale);
+				return iswdigit_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isdigit_l((unsigned char) c, pg_regex_locale));
+					isdigit_l((unsigned char) c, pg_regex_locale->lt));
 #endif
 			break;
 	}
@@ -336,13 +336,13 @@ pg_wc_isalpha(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswalpha_l((wint_t) c, pg_regex_locale);
+				return iswalpha_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isalpha_l((unsigned char) c, pg_regex_locale));
+					isalpha_l((unsigned char) c, pg_regex_locale->lt));
 #endif
 			break;
 	}
@@ -369,13 +369,13 @@ pg_wc_isalnum(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswalnum_l((wint_t) c, pg_regex_locale);
+				return iswalnum_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isalnum_l((unsigned char) c, pg_regex_locale));
+					isalnum_l((unsigned char) c, pg_regex_locale->lt));
 #endif
 			break;
 	}
@@ -402,13 +402,13 @@ pg_wc_isupper(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswupper_l((wint_t) c, pg_regex_locale);
+				return iswupper_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isupper_l((unsigned char) c, pg_regex_locale));
+					isupper_l((unsigned char) c, pg_regex_locale->lt));
 #endif
 			break;
 	}
@@ -435,13 +435,13 @@ pg_wc_islower(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswlower_l((wint_t) c, pg_regex_locale);
+				return iswlower_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					islower_l((unsigned char) c, pg_regex_locale));
+					islower_l((unsigned char) c, pg_regex_locale->lt));
 #endif
 			break;
 	}
@@ -468,13 +468,13 @@ pg_wc_isgraph(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswgraph_l((wint_t) c, pg_regex_locale);
+				return iswgraph_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isgraph_l((unsigned char) c, pg_regex_locale));
+					isgraph_l((unsigned char) c, pg_regex_locale->lt));
 #endif
 			break;
 	}
@@ -501,13 +501,13 @@ pg_wc_isprint(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswprint_l((wint_t) c, pg_regex_locale);
+				return iswprint_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isprint_l((unsigned char) c, pg_regex_locale));
+					isprint_l((unsigned char) c, pg_regex_locale->lt));
 #endif
 			break;
 	}
@@ -534,13 +534,13 @@ pg_wc_ispunct(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswpunct_l((wint_t) c, pg_regex_locale);
+				return iswpunct_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					ispunct_l((unsigned char) c, pg_regex_locale));
+					ispunct_l((unsigned char) c, pg_regex_locale->lt));
 #endif
 			break;
 	}
@@ -567,13 +567,13 @@ pg_wc_isspace(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswspace_l((wint_t) c, pg_regex_locale);
+				return iswspace_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isspace_l((unsigned char) c, pg_regex_locale));
+					isspace_l((unsigned char) c, pg_regex_locale->lt));
 #endif
 			break;
 	}
@@ -608,13 +608,13 @@ pg_wc_toupper(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return towupper_l((wint_t) c, pg_regex_locale);
+				return towupper_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			if (c <= (pg_wchar) UCHAR_MAX)
-				return toupper_l((unsigned char) c, pg_regex_locale);
+				return toupper_l((unsigned char) c, pg_regex_locale->lt);
 #endif
 			return c;
 	}
@@ -649,13 +649,13 @@ pg_wc_tolower(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return towlower_l((wint_t) c, pg_regex_locale);
+				return towlower_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			if (c <= (pg_wchar) UCHAR_MAX)
-				return tolower_l((unsigned char) c, pg_regex_locale);
+				return tolower_l((unsigned char) c, pg_regex_locale->lt);
 #endif
 			return c;
 	}
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index bbd97dc..be3968b 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -82,6 +82,10 @@
 #include <wctype.h>
 #endif
 
+#ifdef USE_ICU
+#include <unicode/ustring.h>
+#endif
+
 #include "catalog/pg_collation.h"
 #include "mb/pg_wchar.h"
 #include "utils/builtins.h"
@@ -1456,6 +1460,42 @@ str_numth(char *dest, char *num, int type)
  *			upper/lower/initcap functions
  *****************************************************************************/
 
+#ifdef USE_ICU
+static int32_t
+icu_convert_case(int32_t (*func)(UChar *, int32_t, const UChar *, int32_t, const char *, UErrorCode *),
+				 pg_locale_t mylocale, UChar **buff_dest, UChar *buff_source, int32_t len_source)
+{
+	UErrorCode	status;
+	int32_t		len_dest;
+
+	len_dest = len_source;  /* try first with same length */
+	*buff_dest = palloc(len_dest * sizeof(**buff_dest));
+	status = U_ZERO_ERROR;
+	len_dest = func(*buff_dest, len_dest, buff_source, len_source, mylocale->locale, &status);
+	if (status == U_BUFFER_OVERFLOW_ERROR)
+	{
+		/* try again with adjusted length */
+		pfree(*buff_dest);
+		*buff_dest = palloc(len_dest * sizeof(**buff_dest));
+		status = U_ZERO_ERROR;
+		len_dest = func(*buff_dest, len_dest, buff_source, len_source, mylocale->locale, &status);
+	}
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("case conversion failed: %s", u_errorName(status))));
+	return len_dest;
+}
+
+static int32_t
+u_strToTitle_default_BI(UChar *dest, int32_t destCapacity,
+						const UChar *src, int32_t srcLength,
+						const char *locale,
+						UErrorCode *pErrorCode)
+{
+	return u_strToTitle(dest, destCapacity, src, srcLength, NULL, locale, pErrorCode);
+}
+#endif
+
 /*
  * If the system provides the needed functions for wide-character manipulation
  * (which are all standardized by C99), then we implement upper/lower/initcap
@@ -1492,12 +1532,9 @@ str_tolower(const char *buff, size_t nbytes, Oid collid)
 		result = asc_tolower(buff, nbytes);
 	}
 #ifdef USE_WIDE_UPPER_LOWER
-	else if (pg_database_encoding_max_length() > 1)
+	else
 	{
 		pg_locale_t mylocale = 0;
-		wchar_t    *workspace;
-		size_t		curr_char;
-		size_t		result_size;
 
 		if (collid != DEFAULT_COLLATION_OID)
 		{
@@ -1515,77 +1552,79 @@ str_tolower(const char *buff, size_t nbytes, Oid collid)
 			mylocale = pg_newlocale_from_collation(collid);
 		}
 
-		/* Overflow paranoia */
-		if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
-			ereport(ERROR,
-					(errcode(ERRCODE_OUT_OF_MEMORY),
-					 errmsg("out of memory")));
+#ifdef USE_ICU
+		if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
+		{
+			int32_t		len_uchar;
+			int32_t		len_conv;
+			UChar	   *buff_uchar;
+			UChar	   *buff_conv;
+
+			len_uchar = icu_to_uchar(mylocale, &buff_uchar, buff, nbytes);
+			len_conv = icu_convert_case(u_strToLower, mylocale, &buff_conv, buff_uchar, len_uchar);
+			icu_from_uchar(mylocale, &result, buff_conv, len_conv);
+		}
+		else
+#endif
+		{
+			if (pg_database_encoding_max_length() > 1)
+			{
+				wchar_t    *workspace;
+				size_t		curr_char;
+				size_t		result_size;
+
+				/* Overflow paranoia */
+				if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
+					ereport(ERROR,
+							(errcode(ERRCODE_OUT_OF_MEMORY),
+							 errmsg("out of memory")));
 
-		/* Output workspace cannot have more codes than input bytes */
-		workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
+				/* Output workspace cannot have more codes than input bytes */
+				workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
 
-		char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
+				char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
 
-		for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
-		{
+				for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
+				{
 #ifdef HAVE_LOCALE_T
-			if (mylocale)
-				workspace[curr_char] = towlower_l(workspace[curr_char], mylocale);
-			else
+					if (mylocale)
+						workspace[curr_char] = towlower_l(workspace[curr_char], mylocale->lt);
+					else
 #endif
-				workspace[curr_char] = towlower(workspace[curr_char]);
-		}
+						workspace[curr_char] = towlower(workspace[curr_char]);
+				}
 
-		/* Make result large enough; case change might change number of bytes */
-		result_size = curr_char * pg_database_encoding_max_length() + 1;
-		result = palloc(result_size);
+				/* Make result large enough; case change might change number of bytes */
+				result_size = curr_char * pg_database_encoding_max_length() + 1;
+				result = palloc(result_size);
 
-		wchar2char(result, workspace, result_size, mylocale);
-		pfree(workspace);
-	}
+				wchar2char(result, workspace, result_size, mylocale);
+				pfree(workspace);
+			}
 #endif   /* USE_WIDE_UPPER_LOWER */
-	else
-	{
-#ifdef HAVE_LOCALE_T
-		pg_locale_t mylocale = 0;
-#endif
-		char	   *p;
-
-		if (collid != DEFAULT_COLLATION_OID)
-		{
-			if (!OidIsValid(collid))
+			else
 			{
-				/*
-				 * This typically means that the parser could not resolve a
-				 * conflict of implicit collations, so report it that way.
-				 */
-				ereport(ERROR,
-						(errcode(ERRCODE_INDETERMINATE_COLLATION),
-						 errmsg("could not determine which collation to use for lower() function"),
-						 errhint("Use the COLLATE clause to set the collation explicitly.")));
-			}
-#ifdef HAVE_LOCALE_T
-			mylocale = pg_newlocale_from_collation(collid);
-#endif
-		}
+				char	   *p;
 
-		result = pnstrdup(buff, nbytes);
+				result = pnstrdup(buff, nbytes);
 
-		/*
-		 * Note: we assume that tolower_l() will not be so broken as to need
-		 * an isupper_l() guard test.  When using the default collation, we
-		 * apply the traditional Postgres behavior that forces ASCII-style
-		 * treatment of I/i, but in non-default collations you get exactly
-		 * what the collation says.
-		 */
-		for (p = result; *p; p++)
-		{
+				/*
+				 * Note: we assume that tolower_l() will not be so broken as to need
+				 * an isupper_l() guard test.  When using the default collation, we
+				 * apply the traditional Postgres behavior that forces ASCII-style
+				 * treatment of I/i, but in non-default collations you get exactly
+				 * what the collation says.
+				 */
+				for (p = result; *p; p++)
+				{
 #ifdef HAVE_LOCALE_T
-			if (mylocale)
-				*p = tolower_l((unsigned char) *p, mylocale);
-			else
+					if (mylocale)
+						*p = tolower_l((unsigned char) *p, mylocale->lt);
+					else
 #endif
-				*p = pg_tolower((unsigned char) *p);
+						*p = pg_tolower((unsigned char) *p);
+				}
+			}
 		}
 	}
 
@@ -1612,12 +1651,9 @@ str_toupper(const char *buff, size_t nbytes, Oid collid)
 		result = asc_toupper(buff, nbytes);
 	}
 #ifdef USE_WIDE_UPPER_LOWER
-	else if (pg_database_encoding_max_length() > 1)
+	else
 	{
 		pg_locale_t mylocale = 0;
-		wchar_t    *workspace;
-		size_t		curr_char;
-		size_t		result_size;
 
 		if (collid != DEFAULT_COLLATION_OID)
 		{
@@ -1635,77 +1671,78 @@ str_toupper(const char *buff, size_t nbytes, Oid collid)
 			mylocale = pg_newlocale_from_collation(collid);
 		}
 
-		/* Overflow paranoia */
-		if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
-			ereport(ERROR,
-					(errcode(ERRCODE_OUT_OF_MEMORY),
-					 errmsg("out of memory")));
+#ifdef USE_ICU
+		if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
+		{
+			int32_t		len_uchar, len_conv;
+			UChar	   *buff_uchar;
+			UChar	   *buff_conv;
 
-		/* Output workspace cannot have more codes than input bytes */
-		workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
+			len_uchar = icu_to_uchar(mylocale, &buff_uchar, buff, nbytes);
+			len_conv = icu_convert_case(u_strToUpper, mylocale, &buff_conv, buff_uchar, len_uchar);
+			icu_from_uchar(mylocale, &result, buff_conv, len_conv);
+		}
+		else
+#endif
+		{
+			if (pg_database_encoding_max_length() > 1)
+			{
+				wchar_t    *workspace;
+				size_t		curr_char;
+				size_t		result_size;
 
-		char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
+				/* Overflow paranoia */
+				if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
+					ereport(ERROR,
+							(errcode(ERRCODE_OUT_OF_MEMORY),
+							 errmsg("out of memory")));
 
-		for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
-		{
-#ifdef HAVE_LOCALE_T
-			if (mylocale)
-				workspace[curr_char] = towupper_l(workspace[curr_char], mylocale);
-			else
-#endif
-				workspace[curr_char] = towupper(workspace[curr_char]);
-		}
+				/* Output workspace cannot have more codes than input bytes */
+				workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
 
-		/* Make result large enough; case change might change number of bytes */
-		result_size = curr_char * pg_database_encoding_max_length() + 1;
-		result = palloc(result_size);
+				char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
 
-		wchar2char(result, workspace, result_size, mylocale);
-		pfree(workspace);
-	}
-#endif   /* USE_WIDE_UPPER_LOWER */
-	else
-	{
+				for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
+				{
 #ifdef HAVE_LOCALE_T
-		pg_locale_t mylocale = 0;
+					if (mylocale)
+						workspace[curr_char] = towupper_l(workspace[curr_char], mylocale->lt);
+					else
 #endif
-		char	   *p;
+						workspace[curr_char] = towupper(workspace[curr_char]);
+				}
 
-		if (collid != DEFAULT_COLLATION_OID)
-		{
-			if (!OidIsValid(collid))
-			{
-				/*
-				 * This typically means that the parser could not resolve a
-				 * conflict of implicit collations, so report it that way.
-				 */
-				ereport(ERROR,
-						(errcode(ERRCODE_INDETERMINATE_COLLATION),
-						 errmsg("could not determine which collation to use for upper() function"),
-						 errhint("Use the COLLATE clause to set the collation explicitly.")));
+				/* Make result large enough; case change might change number of bytes */
+				result_size = curr_char * pg_database_encoding_max_length() + 1;
+				result = palloc(result_size);
+
+				wchar2char(result, workspace, result_size, mylocale);
+				pfree(workspace);
 			}
-#ifdef HAVE_LOCALE_T
-			mylocale = pg_newlocale_from_collation(collid);
-#endif
-		}
+#endif   /* USE_WIDE_UPPER_LOWER */
+			else
+			{
+				char	   *p;
 
-		result = pnstrdup(buff, nbytes);
+				result = pnstrdup(buff, nbytes);
 
-		/*
-		 * Note: we assume that toupper_l() will not be so broken as to need
-		 * an islower_l() guard test.  When using the default collation, we
-		 * apply the traditional Postgres behavior that forces ASCII-style
-		 * treatment of I/i, but in non-default collations you get exactly
-		 * what the collation says.
-		 */
-		for (p = result; *p; p++)
-		{
+				/*
+				 * Note: we assume that toupper_l() will not be so broken as to need
+				 * an islower_l() guard test.  When using the default collation, we
+				 * apply the traditional Postgres behavior that forces ASCII-style
+				 * treatment of I/i, but in non-default collations you get exactly
+				 * what the collation says.
+				 */
+				for (p = result; *p; p++)
+				{
 #ifdef HAVE_LOCALE_T
-			if (mylocale)
-				*p = toupper_l((unsigned char) *p, mylocale);
-			else
+					if (mylocale)
+						*p = toupper_l((unsigned char) *p, mylocale->lt);
+					else
 #endif
-				*p = pg_toupper((unsigned char) *p);
+						*p = pg_toupper((unsigned char) *p);
+				}
+			}
 		}
 	}
 
@@ -1733,12 +1770,9 @@ str_initcap(const char *buff, size_t nbytes, Oid collid)
 		result = asc_initcap(buff, nbytes);
 	}
 #ifdef USE_WIDE_UPPER_LOWER
-	else if (pg_database_encoding_max_length() > 1)
+	else
 	{
 		pg_locale_t mylocale = 0;
-		wchar_t    *workspace;
-		size_t		curr_char;
-		size_t		result_size;
 
 		if (collid != DEFAULT_COLLATION_OID)
 		{
@@ -1756,100 +1790,101 @@ str_initcap(const char *buff, size_t nbytes, Oid collid)
 			mylocale = pg_newlocale_from_collation(collid);
 		}
 
-		/* Overflow paranoia */
-		if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
-			ereport(ERROR,
-					(errcode(ERRCODE_OUT_OF_MEMORY),
-					 errmsg("out of memory")));
-
-		/* Output workspace cannot have more codes than input bytes */
-		workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
-
-		char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
-
-		for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
+#ifdef USE_ICU
+		if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
 		{
-#ifdef HAVE_LOCALE_T
-			if (mylocale)
-			{
-				if (wasalnum)
-					workspace[curr_char] = towlower_l(workspace[curr_char], mylocale);
-				else
-					workspace[curr_char] = towupper_l(workspace[curr_char], mylocale);
-				wasalnum = iswalnum_l(workspace[curr_char], mylocale);
-			}
-			else
+			int32_t		len_uchar, len_conv;
+			UChar	   *buff_uchar;
+			UChar	   *buff_conv;
+
+			len_uchar = icu_to_uchar(mylocale, &buff_uchar, buff, nbytes);
+			len_conv = icu_convert_case(u_strToTitle_default_BI, mylocale, &buff_conv, buff_uchar, len_uchar);
+			icu_from_uchar(mylocale, &result, buff_conv, len_conv);
+		}
+		else
 #endif
+		{
+			if (pg_database_encoding_max_length() > 1)
 			{
-				if (wasalnum)
-					workspace[curr_char] = towlower(workspace[curr_char]);
-				else
-					workspace[curr_char] = towupper(workspace[curr_char]);
-				wasalnum = iswalnum(workspace[curr_char]);
-			}
-		}
+				wchar_t    *workspace;
+				size_t		curr_char;
+				size_t		result_size;
 
-		/* Make result large enough; case change might change number of bytes */
-		result_size = curr_char * pg_database_encoding_max_length() + 1;
-		result = palloc(result_size);
+				/* Overflow paranoia */
+				if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
+					ereport(ERROR,
+							(errcode(ERRCODE_OUT_OF_MEMORY),
+							 errmsg("out of memory")));
 
-		wchar2char(result, workspace, result_size, mylocale);
-		pfree(workspace);
-	}
-#endif   /* USE_WIDE_UPPER_LOWER */
-	else
-	{
-#ifdef HAVE_LOCALE_T
-		pg_locale_t mylocale = 0;
-#endif
-		char	   *p;
+				/* Output workspace cannot have more codes than input bytes */
+				workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
 
-		if (collid != DEFAULT_COLLATION_OID)
-		{
-			if (!OidIsValid(collid))
-			{
-				/*
-				 * This typically means that the parser could not resolve a
-				 * conflict of implicit collations, so report it that way.
-				 */
-				ereport(ERROR,
-						(errcode(ERRCODE_INDETERMINATE_COLLATION),
-						 errmsg("could not determine which collation to use for initcap() function"),
-						 errhint("Use the COLLATE clause to set the collation explicitly.")));
-			}
+				char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
+
+				for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
+				{
 #ifdef HAVE_LOCALE_T
-			mylocale = pg_newlocale_from_collation(collid);
+					if (mylocale)
+					{
+						if (wasalnum)
+							workspace[curr_char] = towlower_l(workspace[curr_char], mylocale->lt);
+						else
+							workspace[curr_char] = towupper_l(workspace[curr_char], mylocale->lt);
+						wasalnum = iswalnum_l(workspace[curr_char], mylocale->lt);
+					}
+					else
 #endif
-		}
+					{
+						if (wasalnum)
+							workspace[curr_char] = towlower(workspace[curr_char]);
+						else
+							workspace[curr_char] = towupper(workspace[curr_char]);
+						wasalnum = iswalnum(workspace[curr_char]);
+					}
+				}
 
-		result = pnstrdup(buff, nbytes);
+				/* Make result large enough; case change might change number of bytes */
+				result_size = curr_char * pg_database_encoding_max_length() + 1;
+				result = palloc(result_size);
 
-		/*
-		 * Note: we assume that toupper_l()/tolower_l() will not be so broken
-		 * as to need guard tests.  When using the default collation, we apply
-		 * the traditional Postgres behavior that forces ASCII-style treatment
-		 * of I/i, but in non-default collations you get exactly what the
-		 * collation says.
-		 */
-		for (p = result; *p; p++)
-		{
-#ifdef HAVE_LOCALE_T
-			if (mylocale)
-			{
-				if (wasalnum)
-					*p = tolower_l((unsigned char) *p, mylocale);
-				else
-					*p = toupper_l((unsigned char) *p, mylocale);
-				wasalnum = isalnum_l((unsigned char) *p, mylocale);
+				wchar2char(result, workspace, result_size, mylocale);
+				pfree(workspace);
 			}
+#endif   /* USE_WIDE_UPPER_LOWER */
 			else
-#endif
 			{
-				if (wasalnum)
-					*p = pg_tolower((unsigned char) *p);
-				else
-					*p = pg_toupper((unsigned char) *p);
-				wasalnum = isalnum((unsigned char) *p);
+				char	   *p;
+
+				result = pnstrdup(buff, nbytes);
+
+				/*
+				 * Note: we assume that toupper_l()/tolower_l() will not be so broken
+				 * as to need guard tests.  When using the default collation, we apply
+				 * the traditional Postgres behavior that forces ASCII-style treatment
+				 * of I/i, but in non-default collations you get exactly what the
+				 * collation says.
+				 */
+				for (p = result; *p; p++)
+				{
+#ifdef HAVE_LOCALE_T
+					if (mylocale)
+					{
+						if (wasalnum)
+							*p = tolower_l((unsigned char) *p, mylocale->lt);
+						else
+							*p = toupper_l((unsigned char) *p, mylocale->lt);
+						wasalnum = isalnum_l((unsigned char) *p, mylocale->lt);
+					}
+					else
+#endif
+					{
+						if (wasalnum)
+							*p = pg_tolower((unsigned char) *p);
+						else
+							*p = pg_toupper((unsigned char) *p);
+						wasalnum = isalnum((unsigned char) *p);
+					}
+				}
 			}
 		}
 	}
diff --git a/src/backend/utils/adt/like.c b/src/backend/utils/adt/like.c
index 08b86f5..272a721 100644
--- a/src/backend/utils/adt/like.c
+++ b/src/backend/utils/adt/like.c
@@ -96,7 +96,7 @@ SB_lower_char(unsigned char c, pg_locale_t locale, bool locale_is_c)
 		return pg_ascii_tolower(c);
 #ifdef HAVE_LOCALE_T
 	else if (locale)
-		return tolower_l(c, locale);
+		return tolower_l(c, locale->lt);
 #endif
 	else
 		return pg_tolower(c);
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index a818023..5d0e39d 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -1213,7 +1213,60 @@ pg_newlocale_from_collation(Oid collid)
 #endif
 		}
 
-		cache_entry->locale = result;
+		cache_entry->locale = malloc(sizeof(*cache_entry->locale));
+		cache_entry->locale->provider = collform->collprovider;
+		cache_entry->locale->lt = result;
+
+		if (collform->collprovider == COLLPROVIDER_ICU)
+		{
+#ifdef USE_ICU
+			const char *collcollate;
+			UCollator  *collator;
+			const char *encstring;
+			UConverter *converter;
+			UErrorCode	status;
+			UVersionInfo versioninfo;
+			int32		numversion;
+
+			collcollate = NameStr(collform->collcollate);
+
+			status = U_ZERO_ERROR;
+			collator = ucol_open(collcollate, &status);
+			if (U_FAILURE(status))
+				ereport(ERROR,
+						(errmsg("could not open collator for locale \"%s\": %s",
+								collcollate, u_errorName(status))));
+
+			encstring = pg_enc2name_tbl[GetDatabaseEncoding()].name;
+
+			status = U_ZERO_ERROR;
+			converter = ucnv_open(encstring, &status);
+			if (U_FAILURE(status))
+				ereport(ERROR,
+						(errmsg("could not open converter for encoding \"%s\": %s",
+								encstring, u_errorName(status))));
+
+			ucol_getVersion(collator, versioninfo);
+			numversion = ntohl(*((uint32 *) versioninfo));
+
+			if (numversion != collform->collversion)
+				ereport(WARNING,
+						(errmsg("ICU collator version mismatch"),
+						 errdetail("The database was created using version 0x%08X, the library provides version 0x%08X.",
+								   (uint32) collform->collversion, (uint32) numversion),
+						 errhint("Rebuild affected indexes, or build PostgreSQL with the right version of ICU.")));
+
+			cache_entry->locale->ucol = collator;
+			cache_entry->locale->locale = strdup(collcollate);
+			cache_entry->locale->ucnv = converter;
+#else /* not USE_ICU */
+			/* could get here if a collation was created by a build with ICU */
+			ereport(ERROR,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+					 errmsg("ICU is not supported in this build"),
+					 errhint("You need to rebuild PostgreSQL using --with-icu.")));
+#endif /* not USE_ICU */
+		}
 
 		ReleaseSysCache(tp);
 #else							/* not HAVE_LOCALE_T */
@@ -1232,6 +1285,40 @@ pg_newlocale_from_collation(Oid collid)
 }
 
 
+#ifdef USE_ICU
+int32_t
+icu_to_uchar(pg_locale_t mylocale, UChar **buff_uchar, const char *buff, size_t nbytes)
+{
+	UErrorCode	status;
+	int32_t		len_uchar;
+
+	len_uchar = 2 * nbytes;  /* max length per docs */
+	*buff_uchar = palloc(len_uchar * sizeof(**buff_uchar));
+	status = U_ZERO_ERROR;
+	len_uchar = ucnv_toUChars(mylocale->ucnv, *buff_uchar, len_uchar, buff, nbytes, &status);
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("ucnv_toUChars failed: %s", u_errorName(status))));
+	return len_uchar;
+}
+
+int32_t
+icu_from_uchar(pg_locale_t mylocale, char **result, UChar *buff_uchar, int32_t len_uchar)
+{
+	UErrorCode	status;
+	int32_t		len_result;
+
+	len_result = UCNV_GET_MAX_BYTES_FOR_STRING(len_uchar, ucnv_getMaxCharSize(mylocale->ucnv));
+	*result = palloc(len_result + 1);
+	status = U_ZERO_ERROR;
+	len_result = ucnv_fromUChars(mylocale->ucnv, *result, len_result + 1, buff_uchar, len_uchar, &status);
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("ucnv_fromUChars failed: %s", u_errorName(status))));
+	return len_result;
+}
+#endif
+
 /*
  * These functions convert from/to libc's wchar_t, *not* pg_wchar_t.
  * Therefore we keep them here rather than with the mbutils code.
@@ -1287,7 +1374,7 @@ wchar2char(char *to, const wchar_t *from, size_t tolen, pg_locale_t locale)
 #ifdef HAVE_LOCALE_T
 #ifdef HAVE_WCSTOMBS_L
 		/* Use wcstombs_l for nondefault locales */
-		result = wcstombs_l(to, from, tolen, locale);
+		result = wcstombs_l(to, from, tolen, locale->lt);
 #else							/* !HAVE_WCSTOMBS_L */
 		/* We have to temporarily set the locale as current ... ugh */
-		locale_t	save_locale = uselocale(locale);
+		locale_t	save_locale = uselocale(locale->lt);
@@ -1362,7 +1449,7 @@ char2wchar(wchar_t *to, size_t tolen, const char *from, size_t fromlen,
 #ifdef HAVE_LOCALE_T
 #ifdef HAVE_MBSTOWCS_L
 			/* Use mbstowcs_l for nondefault locales */
-			result = mbstowcs_l(to, str, tolen, locale);
+			result = mbstowcs_l(to, str, tolen, locale->lt);
 #else							/* !HAVE_MBSTOWCS_L */
 			/* We have to temporarily set the locale as current ... ugh */
-			locale_t	save_locale = uselocale(locale);
+			locale_t	save_locale = uselocale(locale->lt);
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 56943f2..662956d 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -5274,7 +5274,7 @@ pattern_char_isalpha(char c, bool is_multibyte,
 		return true;
 #ifdef HAVE_LOCALE_T
 	else if (locale)
-		return isalpha_l((unsigned char) c, locale);
+		return isalpha_l((unsigned char) c, locale->lt);
 #endif
 	else
 		return isalpha((unsigned char) c);
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index bf7c0cd..af6f762 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -1402,10 +1402,7 @@ varstr_cmp(char *arg1, int len1, char *arg2, int len2, Oid collid)
 		char		a2buf[TEXTBUFLEN];
 		char	   *a1p,
 				   *a2p;
-
-#ifdef HAVE_LOCALE_T
 		pg_locale_t mylocale = 0;
-#endif
 
 		if (collid != DEFAULT_COLLATION_OID)
 		{
@@ -1420,9 +1417,7 @@ varstr_cmp(char *arg1, int len1, char *arg2, int len2, Oid collid)
 						 errmsg("could not determine which collation to use for string comparison"),
 						 errhint("Use the COLLATE clause to set the collation explicitly.")));
 			}
-#ifdef HAVE_LOCALE_T
 			mylocale = pg_newlocale_from_collation(collid);
-#endif
 		}
 
 		/*
@@ -1541,9 +1536,42 @@ varstr_cmp(char *arg1, int len1, char *arg2, int len2, Oid collid)
 		memcpy(a2p, arg2, len2);
 		a2p[len2] = '\0';
 
-#ifdef HAVE_LOCALE_T
 		if (mylocale)
-			result = strcoll_l(a1p, a2p, mylocale);
+		{
+#ifdef USE_ICU
+			if (mylocale->provider == COLLPROVIDER_ICU)
+			{
+				if (GetDatabaseEncoding() == PG_UTF8)
+				{
+					UErrorCode	status;
+
+					status = U_ZERO_ERROR;
+					result = ucol_strcollUTF8(mylocale->ucol,
+											  arg1, len1,
+											  arg2, len2,
+											  &status);
+					if (U_FAILURE(status))
+						ereport(ERROR,
+								(errmsg("collation failed: %s", u_errorName(status))));
+				}
+				else
+				{
+					int32_t ulen1, ulen2;
+					UChar *uchar1, *uchar2;
+
+					ulen1 = icu_to_uchar(mylocale, &uchar1, arg1, len1);
+					ulen2 = icu_to_uchar(mylocale, &uchar2, arg2, len2);
+
+					result = ucol_strcoll(mylocale->ucol,
+										  uchar1, ulen1,
+										  uchar2, ulen2);
+				}
+			}
+			else
+#endif
+#ifdef HAVE_LOCALE_T
+				result = strcoll_l(a1p, a2p, mylocale->lt);
+		}
 		else
 #endif
 			result = strcoll(a1p, a2p);
@@ -2089,9 +2117,42 @@ varstrfastcmp_locale(Datum x, Datum y, SortSupport ssup)
 		goto done;
 	}
 
-#ifdef HAVE_LOCALE_T
 	if (sss->locale)
-		result = strcoll_l(sss->buf1, sss->buf2, sss->locale);
+	{
+#ifdef USE_ICU
+		if (sss->locale->provider == COLLPROVIDER_ICU)
+		{
+			if (GetDatabaseEncoding() == PG_UTF8)
+			{
+				UErrorCode	status;
+
+				status = U_ZERO_ERROR;
+				result = ucol_strcollUTF8(sss->locale->ucol,
+										  a1p, len1,
+										  a2p, len2,
+										  &status);
+				if (U_FAILURE(status))
+					ereport(ERROR,
+							(errmsg("collation failed: %s", u_errorName(status))));
+			}
+			else
+			{
+				int32_t ulen1, ulen2;
+				UChar *uchar1, *uchar2;
+
+				ulen1 = icu_to_uchar(sss->locale, &uchar1, a1p, len1);
+				ulen2 = icu_to_uchar(sss->locale, &uchar2, a2p, len2);
+
+				result = ucol_strcoll(sss->locale->ucol,
+									  uchar1, ulen1,
+									  uchar2, ulen2);
+			}
+		}
+		else
+#endif
+#ifdef HAVE_LOCALE_T
+			result = strcoll_l(sss->buf1, sss->buf2, sss->locale->lt);
+	}
 	else
 #endif
 		result = strcoll(sss->buf1, sss->buf2);
@@ -2231,7 +2292,7 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
 #ifdef HAVE_LOCALE_T
 			if (sss->locale)
 				bsize = strxfrm_l(sss->buf2, sss->buf1,
-								  sss->buflen2, sss->locale);
+								  sss->buflen2, sss->locale->lt);
 			else
 #endif
 				bsize = strxfrm(sss->buf2, sss->buf1, sss->buflen2);
diff --git a/src/bin/initdb/Makefile b/src/bin/initdb/Makefile
index 394eae0..404fd33 100644
--- a/src/bin/initdb/Makefile
+++ b/src/bin/initdb/Makefile
@@ -16,7 +16,7 @@ subdir = src/bin/initdb
 top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global
 
-override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) -I$(top_srcdir)/src/timezone $(CPPFLAGS)
+override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) -I$(top_srcdir)/src/timezone $(CPPFLAGS) $(ICU_CFLAGS)
 
 # note: we need libpq only because fe_utils does
 LDFLAGS += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
@@ -31,7 +31,7 @@ OBJS=	initdb.o findtimezone.o localtime.o encnames.o $(WIN32RES)
 all: initdb
 
 initdb: $(OBJS) | submake-libpq submake-libpgport submake-libpgfeutils
-	$(CC) $(CFLAGS) $(OBJS) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X)
+	$(CC) $(CFLAGS) $(OBJS) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) $(ICU_LIBS) -o $@$(X)
 
 # We used to pull in all of libpq to get encnames.c, but that
 # exposes us to risks of version skew if we link to a shared library.
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index aad6ba5..6a3712f 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -69,6 +69,11 @@
 #include "miscadmin.h"
 #include "fe_utils/string_utils.h"
 
+#ifdef USE_ICU
+#include <unicode/ucol.h>
+#include <unicode/uloc.h>
+#endif
+
 
 /* Define PG_FLUSH_DATA_WORKS if we have an implementation for pg_flush_data */
 #if defined(HAVE_SYNC_FILE_RANGE)
@@ -332,6 +337,12 @@ do { \
 		output_failed = true, output_errno = errno; \
 } while (0)
 
+#define PG_CMD_PRINTF4(fmt, arg1, arg2, arg3, arg4)		\
+do { \
+	if (fprintf(cmdfd, fmt, arg1, arg2, arg3, arg4) < 0 || fflush(cmdfd) < 0) \
+		output_failed = true, output_errno = errno; \
+} while (0)
+
 static char *
 escape_quotes(const char *src)
 {
@@ -1921,12 +1932,12 @@ setup_collation(FILE *cmdfd)
 	 * Also, eliminate any aliases that conflict with pg_collation's
 	 * hard-wired entries for "C" etc.
 	 */
-	PG_CMD_PUTS("INSERT INTO pg_collation (collname, collnamespace, collowner, collencoding, collcollate, collctype) "
+	PG_CMD_PUTS("INSERT INTO pg_collation (collname, collnamespace, collowner, collprovider, collencoding, collcollate, collctype, collversion) "
 				" SELECT DISTINCT ON (collname, encoding)"
 				"   collname, "
 				"   (SELECT oid FROM pg_namespace WHERE nspname = 'pg_catalog') AS collnamespace, "
 				"   (SELECT relowner FROM pg_class WHERE relname = 'pg_collation') AS collowner, "
-				"   encoding, locale, locale "
+				"   'p', encoding, locale, locale, 0 "
 				"  FROM tmp_pg_collation"
 				"  WHERE NOT EXISTS (SELECT 1 FROM pg_collation WHERE collname = tmp_pg_collation.collname)"
 	 "  ORDER BY collname, encoding, (collname = locale) DESC, locale;\n\n");
@@ -1945,6 +1956,41 @@ setup_collation(FILE *cmdfd)
 		printf(_("Use the option \"--debug\" to see details.\n"));
 	}
 #endif   /* not HAVE_LOCALE_T  && not WIN32 */
+
+#ifdef USE_ICU
+	{
+		int i;
+
+		for (i = 0; i < uloc_countAvailable(); i++)
+		{
+			const char *name = uloc_getAvailable(i);
+			UCollator  *collator;
+			UErrorCode	status;
+			UVersionInfo versioninfo;
+			int32		collversion;
+
+			status = U_ZERO_ERROR;
+			collator = ucol_open(name, &status);
+			if (U_FAILURE(status))
+			{
+				fprintf(stderr, _("%s: could not open collator for locale \"%s\": %s\n"),
+						progname, name, u_errorName(status));
+				exit_nicely();
+			}
+			ucol_getVersion(collator, versioninfo);
+			ucol_close(collator);
+
+			collversion = ntohl(*((uint32 *) versioninfo));
+
+			PG_CMD_PRINTF4("INSERT INTO pg_collation (collname, collnamespace, collowner, collprovider, collencoding, collcollate, collctype, collversion) "
+						   "VALUES ('%s%%icu', "
+						   "(SELECT oid FROM pg_namespace WHERE nspname = 'pg_catalog'), "
+						   "(SELECT relowner FROM pg_class WHERE relname = 'pg_collation'), "
+						   "'i', -1, '%s', '%s', %d);\n\n",
+						   name, name, name, collversion);
+		}
+	}
+#endif
 }
 
 /*
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index a5c2d09..644c5c0 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -13176,8 +13176,10 @@ dumpCollation(Archive *fout, CollInfo *collinfo)
 	PQExpBuffer delq;
 	PQExpBuffer labelq;
 	PGresult   *res;
+	int			i_collprovider;
 	int			i_collcollate;
 	int			i_collctype;
+	const char *collprovider;
 	const char *collcollate;
 	const char *collctype;
 
@@ -13194,18 +13196,30 @@ dumpCollation(Archive *fout, CollInfo *collinfo)
 	selectSourceSchema(fout, collinfo->dobj.namespace->dobj.name);
 
 	/* Get collation-specific details */
-	appendPQExpBuffer(query, "SELECT "
-					  "collcollate, "
-					  "collctype "
-					  "FROM pg_catalog.pg_collation c "
-					  "WHERE c.oid = '%u'::pg_catalog.oid",
-					  collinfo->dobj.catId.oid);
+	if (fout->remoteVersion >= 100000)
+		appendPQExpBuffer(query, "SELECT "
+						  "collprovider, "
+						  "collcollate, "
+						  "collctype "
+						  "FROM pg_catalog.pg_collation c "
+						  "WHERE c.oid = '%u'::pg_catalog.oid",
+						  collinfo->dobj.catId.oid);
+	else
+		appendPQExpBuffer(query, "SELECT "
+						  "'p'::char AS collprovider, "
+						  "collcollate, "
+						  "collctype "
+						  "FROM pg_catalog.pg_collation c "
+						  "WHERE c.oid = '%u'::pg_catalog.oid",
+						  collinfo->dobj.catId.oid);
 
 	res = ExecuteSqlQueryForSingleRow(fout, query->data);
 
+	i_collprovider = PQfnumber(res, "collprovider");
 	i_collcollate = PQfnumber(res, "collcollate");
 	i_collctype = PQfnumber(res, "collctype");
 
+	collprovider = PQgetvalue(res, 0, i_collprovider);
 	collcollate = PQgetvalue(res, 0, i_collcollate);
 	collctype = PQgetvalue(res, 0, i_collctype);
 
@@ -13217,11 +13231,32 @@ dumpCollation(Archive *fout, CollInfo *collinfo)
 	appendPQExpBuffer(delq, ".%s;\n",
 					  fmtId(collinfo->dobj.name));
 
-	appendPQExpBuffer(q, "CREATE COLLATION %s (lc_collate = ",
+	appendPQExpBuffer(q, "CREATE COLLATION %s (",
 					  fmtId(collinfo->dobj.name));
-	appendStringLiteralAH(q, collcollate, fout);
-	appendPQExpBufferStr(q, ", lc_ctype = ");
-	appendStringLiteralAH(q, collctype, fout);
+
+	appendPQExpBufferStr(q, "provider = ");
+	if (collprovider[0] == 'i')
+		appendStringLiteralAH(q, "icu", fout);
+	else if (collprovider[0] == 'p')
+		appendStringLiteralAH(q, "posix", fout);
+	else
+		exit_horribly(NULL,
+					  "unrecognized collation provider: %s\n",
+					  collprovider);
+
+	if (strcmp(collcollate, collctype) == 0)
+	{
+		appendPQExpBufferStr(q, ", locale = ");
+		appendStringLiteralAH(q, collcollate, fout);
+	}
+	else
+	{
+		appendPQExpBufferStr(q, ", lc_collate = ");
+		appendStringLiteralAH(q, collcollate, fout);
+		appendPQExpBufferStr(q, ", lc_ctype = ");
+		appendStringLiteralAH(q, collctype, fout);
+	}
+
 	appendPQExpBufferStr(q, ");\n");
 
 	appendPQExpBuffer(labelq, "COLLATION %s", fmtId(collinfo->dobj.name));
diff --git a/src/include/catalog/pg_collation.h b/src/include/catalog/pg_collation.h
index 6367712..a1c65f6 100644
--- a/src/include/catalog/pg_collation.h
+++ b/src/include/catalog/pg_collation.h
@@ -34,9 +34,11 @@ CATALOG(pg_collation,3456)
 	NameData	collname;		/* collation name */
 	Oid			collnamespace;	/* OID of namespace containing collation */
 	Oid			collowner;		/* owner of collation */
+	char		collprovider;	/* see constants below */
 	int32		collencoding;	/* encoding for this collation; -1 = "all" */
 	NameData	collcollate;	/* LC_COLLATE setting */
 	NameData	collctype;		/* LC_CTYPE setting */
+	int32		collversion;	/* provider-dependent version of collation data */
 } FormData_pg_collation;
 
 /* ----------------
@@ -50,27 +52,34 @@ typedef FormData_pg_collation *Form_pg_collation;
  *		compiler constants for pg_collation
  * ----------------
  */
-#define Natts_pg_collation				6
+#define Natts_pg_collation				8
 #define Anum_pg_collation_collname		1
 #define Anum_pg_collation_collnamespace 2
 #define Anum_pg_collation_collowner		3
-#define Anum_pg_collation_collencoding	4
-#define Anum_pg_collation_collcollate	5
-#define Anum_pg_collation_collctype		6
+#define Anum_pg_collation_collprovider	4
+#define Anum_pg_collation_collencoding	5
+#define Anum_pg_collation_collcollate	6
+#define Anum_pg_collation_collctype		7
+#define Anum_pg_collation_collversion	8
 
 /* ----------------
  *		initial contents of pg_collation
  * ----------------
  */
 
-DATA(insert OID = 100 ( default		PGNSP PGUID -1 "" "" ));
+DATA(insert OID = 100 ( default		PGNSP PGUID d -1 "" "" 0 ));
 DESCR("database's default collation");
 #define DEFAULT_COLLATION_OID	100
-DATA(insert OID = 950 ( C			PGNSP PGUID -1 "C" "C" ));
+DATA(insert OID = 950 ( C			PGNSP PGUID p -1 "C" "C" 0 ));
 DESCR("standard C collation");
 #define C_COLLATION_OID			950
-DATA(insert OID = 951 ( POSIX		PGNSP PGUID -1 "POSIX" "POSIX" ));
+DATA(insert OID = 951 ( POSIX		PGNSP PGUID p -1 "POSIX" "POSIX" 0 ));
 DESCR("standard POSIX collation");
 #define POSIX_COLLATION_OID		951
 
+
+#define COLLPROVIDER_DEFAULT	'd'
+#define COLLPROVIDER_ICU		'i'
+#define COLLPROVIDER_POSIX		'p'
+
 #endif   /* PG_COLLATION_H */
diff --git a/src/include/catalog/pg_collation_fn.h b/src/include/catalog/pg_collation_fn.h
index 574b288..bda83b9 100644
--- a/src/include/catalog/pg_collation_fn.h
+++ b/src/include/catalog/pg_collation_fn.h
@@ -16,8 +16,10 @@
 
 extern Oid CollationCreate(const char *collname, Oid collnamespace,
 				Oid collowner,
+						   char collprovider,
 				int32 collencoding,
-				const char *collcollate, const char *collctype);
+						   const char *collcollate, const char *collctype,
+						   int32 collversion);
 extern void RemoveCollationById(Oid collationOid);
 
 #endif   /* PG_COLLATION_FN_H */
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index b621ff2..e379058 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -804,6 +804,9 @@
    (--enable-float8-byval) */
 #undef USE_FLOAT8_BYVAL
 
+/* Define to build with ICU support. (--with-icu) */
+#undef USE_ICU
+
 /* Define to 1 if you want 64-bit integer timestamp and interval support.
    (--enable-integer-datetimes) */
 #undef USE_INTEGER_DATETIMES
diff --git a/src/include/utils/pg_locale.h b/src/include/utils/pg_locale.h
index 0a4b9f7..bb48119 100644
--- a/src/include/utils/pg_locale.h
+++ b/src/include/utils/pg_locale.h
@@ -16,6 +16,10 @@
 #if defined(LOCALE_T_IN_XLOCALE) || defined(WCSTOMBS_L_IN_XLOCALE)
 #include <xlocale.h>
 #endif
+#ifdef USE_ICU
+#include <unicode/ucnv.h>
+#include <unicode/ucol.h>
+#endif
 
 #include "utils/guc.h"
 
@@ -62,17 +66,30 @@ extern void cache_locale_time(void);
  * We define our own wrapper around locale_t so we can keep the same
  * function signatures for all builds, while not having to create a
  * fake version of the standard type locale_t in the global namespace.
- * The fake version of pg_locale_t can be checked for truth; that's
- * about all it will be needed for.
+ * pg_locale_t is occasionally checked for truth, so make it a pointer.
  */
+struct pg_locale_t
+{
+	char	provider;
 #ifdef HAVE_LOCALE_T
-typedef locale_t pg_locale_t;
-#else
-typedef int pg_locale_t;
+	locale_t lt;
+#endif
+#ifdef USE_ICU
+	const char *locale;
+	UCollator *ucol;
+	UConverter *ucnv;
 #endif
+};
+
+typedef struct pg_locale_t *pg_locale_t;
 
 extern pg_locale_t pg_newlocale_from_collation(Oid collid);
 
+#ifdef USE_ICU
+extern int32_t icu_to_uchar(pg_locale_t mylocale, UChar **buff_uchar, const char *buff, size_t nbytes);
+extern int32_t icu_from_uchar(pg_locale_t mylocale, char **result, UChar *buff_uchar, int32_t len_uchar);
+#endif
+
 /* These functions convert from/to libc's wchar_t, *not* pg_wchar_t */
 #ifdef USE_WIDE_UPPER_LOWER
 extern size_t wchar2char(char *to, const wchar_t *from, size_t tolen,
diff --git a/src/test/regress/GNUmakefile b/src/test/regress/GNUmakefile
index 6a275cb..52f095b 100644
--- a/src/test/regress/GNUmakefile
+++ b/src/test/regress/GNUmakefile
@@ -123,6 +123,9 @@ tablespace-setup:
 ##
 
 REGRESS_OPTS = --dlpath=. $(EXTRA_REGRESS_OPTS)
+ifeq ($(with_icu),yes)
+EXTRA_TESTS += collate.icu
+endif
 
 check: all tablespace-setup
 	$(pg_regress_check) $(REGRESS_OPTS) --schedule=$(srcdir)/parallel_schedule $(MAXCONNOPT) $(EXTRA_TESTS)
diff --git a/src/test/regress/expected/collate.icu.out b/src/test/regress/expected/collate.icu.out
new file mode 100644
index 0000000..761bd4b
--- /dev/null
+++ b/src/test/regress/expected/collate.icu.out
@@ -0,0 +1,1076 @@
+/*
+ * This test is for ICU collation support and assumes that the needed OS
+ * and ICU locales are installed.  It must be run in a database with UTF-8
+ * encoding, because other encodings don't support all the characters used.
+ */
+SET client_encoding TO UTF8;
+CREATE TABLE collate_test1 (
+    a int,
+    b text COLLATE "en_US%icu" NOT NULL
+);
+\d collate_test1
+         Table "public.collate_test1"
+ Column |  Type   |         Modifiers          
+--------+---------+----------------------------
+ a      | integer | 
+ b      | text    | collate en_US%icu not null
+
+CREATE TABLE collate_test_fail (
+    a int,
+    b text COLLATE "ja_JP.eucjp%icu"
+);
+ERROR:  collation "ja_JP.eucjp%icu" for encoding "UTF8" does not exist
+LINE 3:     b text COLLATE "ja_JP.eucjp%icu"
+                   ^
+CREATE TABLE collate_test_fail (
+    a int,
+    b text COLLATE "foo%icu"
+);
+ERROR:  collation "foo%icu" for encoding "UTF8" does not exist
+LINE 3:     b text COLLATE "foo%icu"
+                   ^
+CREATE TABLE collate_test_fail (
+    a int COLLATE "en_US%icu",
+    b text
+);
+ERROR:  collations are not supported by type integer
+LINE 2:     a int COLLATE "en_US%icu",
+                  ^
+CREATE TABLE collate_test_like (
+    LIKE collate_test1
+);
+\d collate_test_like
+       Table "public.collate_test_like"
+ Column |  Type   |         Modifiers          
+--------+---------+----------------------------
+ a      | integer | 
+ b      | text    | collate en_US%icu not null
+
+CREATE TABLE collate_test2 (
+    a int,
+    b text COLLATE "sv_SE%icu"
+);
+CREATE TABLE collate_test3 (
+    a int,
+    b text COLLATE "C"
+);
+INSERT INTO collate_test1 VALUES (1, 'abc'), (2, 'äbc'), (3, 'bbc'), (4, 'ABC');
+INSERT INTO collate_test2 SELECT * FROM collate_test1;
+INSERT INTO collate_test3 SELECT * FROM collate_test1;
+SELECT * FROM collate_test1 WHERE b >= 'bbc';
+ a |  b  
+---+-----
+ 3 | bbc
+(1 row)
+
+SELECT * FROM collate_test2 WHERE b >= 'bbc';
+ a |  b  
+---+-----
+ 2 | äbc
+ 3 | bbc
+(2 rows)
+
+SELECT * FROM collate_test3 WHERE b >= 'bbc';
+ a |  b  
+---+-----
+ 2 | äbc
+ 3 | bbc
+(2 rows)
+
+SELECT * FROM collate_test3 WHERE b >= 'BBC';
+ a |  b  
+---+-----
+ 1 | abc
+ 2 | äbc
+ 3 | bbc
+(3 rows)
+
+SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc';
+ a |  b  
+---+-----
+ 2 | äbc
+ 3 | bbc
+(2 rows)
+
+SELECT * FROM collate_test1 WHERE b >= 'bbc' COLLATE "C";
+ a |  b  
+---+-----
+ 2 | äbc
+ 3 | bbc
+(2 rows)
+
+SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "C";
+ a |  b  
+---+-----
+ 2 | äbc
+ 3 | bbc
+(2 rows)
+
+SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "en_US%icu";
+ERROR:  collation mismatch between explicit collations "C" and "en_US%icu"
+LINE 1: ...* FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "e...
+                                                             ^
+CREATE DOMAIN testdomain_sv AS text COLLATE "sv_SE%icu";
+CREATE DOMAIN testdomain_i AS int COLLATE "sv_SE%icu"; -- fails
+ERROR:  collations are not supported by type integer
+CREATE TABLE collate_test4 (
+    a int,
+    b testdomain_sv
+);
+INSERT INTO collate_test4 SELECT * FROM collate_test1;
+SELECT a, b FROM collate_test4 ORDER BY b;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+CREATE TABLE collate_test5 (
+    a int,
+    b testdomain_sv COLLATE "en_US%icu"
+);
+INSERT INTO collate_test5 SELECT * FROM collate_test1;
+SELECT a, b FROM collate_test5 ORDER BY b;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 2 | äbc
+ 3 | bbc
+(4 rows)
+
+SELECT a, b FROM collate_test1 ORDER BY b;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 2 | äbc
+ 3 | bbc
+(4 rows)
+
+SELECT a, b FROM collate_test2 ORDER BY b;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, b FROM collate_test3 ORDER BY b;
+ a |  b  
+---+-----
+ 4 | ABC
+ 1 | abc
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, b FROM collate_test1 ORDER BY b COLLATE "C";
+ a |  b  
+---+-----
+ 4 | ABC
+ 1 | abc
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+-- star expansion
+SELECT * FROM collate_test1 ORDER BY b;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 2 | äbc
+ 3 | bbc
+(4 rows)
+
+SELECT * FROM collate_test2 ORDER BY b;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT * FROM collate_test3 ORDER BY b;
+ a |  b  
+---+-----
+ 4 | ABC
+ 1 | abc
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+-- constant expression folding
+SELECT 'bbc' COLLATE "en_US%icu" > 'äbc' COLLATE "en_US%icu" AS "true";
+ true 
+------
+ t
+(1 row)
+
+SELECT 'bbc' COLLATE "sv_SE%icu" > 'äbc' COLLATE "sv_SE%icu" AS "false";
+ false 
+-------
+ f
+(1 row)
+
+-- upper/lower
+CREATE TABLE collate_test10 (
+    a int,
+    x text COLLATE "en_US%icu",
+    y text COLLATE "tr_TR%icu"
+);
+INSERT INTO collate_test10 VALUES (1, 'hij', 'hij'), (2, 'HIJ', 'HIJ');
+SELECT a, lower(x), lower(y), upper(x), upper(y), initcap(x), initcap(y) FROM collate_test10;
+ a | lower | lower | upper | upper | initcap | initcap 
+---+-------+-------+-------+-------+---------+---------
+ 1 | hij   | hij   | HIJ   | HİJ   | Hij     | Hij
+ 2 | hij   | hıj   | HIJ   | HIJ   | Hij     | Hıj
+(2 rows)
+
+SELECT a, lower(x COLLATE "C"), lower(y COLLATE "C") FROM collate_test10;
+ a | lower | lower 
+---+-------+-------
+ 1 | hij   | hij
+ 2 | hij   | hij
+(2 rows)
+
+SELECT a, x, y FROM collate_test10 ORDER BY lower(y), a;
+ a |  x  |  y  
+---+-----+-----
+ 2 | HIJ | HIJ
+ 1 | hij | hij
+(2 rows)
+
+-- LIKE/ILIKE
+SELECT * FROM collate_test1 WHERE b LIKE 'abc';
+ a |  b  
+---+-----
+ 1 | abc
+(1 row)
+
+SELECT * FROM collate_test1 WHERE b LIKE 'abc%';
+ a |  b  
+---+-----
+ 1 | abc
+(1 row)
+
+SELECT * FROM collate_test1 WHERE b LIKE '%bc%';
+ a |  b  
+---+-----
+ 1 | abc
+ 2 | äbc
+ 3 | bbc
+(3 rows)
+
+SELECT * FROM collate_test1 WHERE b ILIKE 'abc';
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+(2 rows)
+
+SELECT * FROM collate_test1 WHERE b ILIKE 'abc%';
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+(2 rows)
+
+SELECT * FROM collate_test1 WHERE b ILIKE '%bc%';
+ a |  b  
+---+-----
+ 1 | abc
+ 2 | äbc
+ 3 | bbc
+ 4 | ABC
+(4 rows)
+
+SELECT 'Türkiye' COLLATE "en_US%icu" ILIKE '%KI%' AS "true";
+ true 
+------
+ t
+(1 row)
+
+SELECT 'Türkiye' COLLATE "tr_TR%icu" ILIKE '%KI%' AS "false";
+ false 
+-------
+ f
+(1 row)
+
+SELECT 'bıt' ILIKE 'BIT' COLLATE "en_US%icu" AS "false";
+ false 
+-------
+ f
+(1 row)
+
+SELECT 'bıt' ILIKE 'BIT' COLLATE "tr_TR%icu" AS "true";
+ true 
+------
+ t
+(1 row)
+
+-- The following actually exercises the selectivity estimation for ILIKE.
+SELECT relname FROM pg_class WHERE relname ILIKE 'abc%';
+ relname 
+---------
+(0 rows)
+
+-- regular expressions
+SELECT * FROM collate_test1 WHERE b ~ '^abc$';
+ a |  b  
+---+-----
+ 1 | abc
+(1 row)
+
+SELECT * FROM collate_test1 WHERE b ~ '^abc';
+ a |  b  
+---+-----
+ 1 | abc
+(1 row)
+
+SELECT * FROM collate_test1 WHERE b ~ 'bc';
+ a |  b  
+---+-----
+ 1 | abc
+ 2 | äbc
+ 3 | bbc
+(3 rows)
+
+SELECT * FROM collate_test1 WHERE b ~* '^abc$';
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+(2 rows)
+
+SELECT * FROM collate_test1 WHERE b ~* '^abc';
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+(2 rows)
+
+SELECT * FROM collate_test1 WHERE b ~* 'bc';
+ a |  b  
+---+-----
+ 1 | abc
+ 2 | äbc
+ 3 | bbc
+ 4 | ABC
+(4 rows)
+
+SELECT 'Türkiye' COLLATE "en_US%icu" ~* 'KI' AS "true";
+ true 
+------
+ t
+(1 row)
+
+SELECT 'Türkiye' COLLATE "tr_TR%icu" ~* 'KI' AS "false";
+ false 
+-------
+ f
+(1 row)
+
+SELECT 'bıt' ~* 'BIT' COLLATE "en_US%icu" AS "false";
+ false 
+-------
+ f
+(1 row)
+
+SELECT 'bıt' ~* 'BIT' COLLATE "tr_TR%icu" AS "true";
+ true 
+------
+ t
+(1 row)
+
+-- The following actually exercises the selectivity estimation for ~*.
+SELECT relname FROM pg_class WHERE relname ~* '^abc';
+ relname 
+---------
+(0 rows)
+
+-- to_char
+SET lc_time TO 'tr_TR';
+SELECT to_char(date '2010-04-01', 'DD TMMON YYYY');
+   to_char   
+-------------
+ 01 NIS 2010
+(1 row)
+
+SELECT to_char(date '2010-04-01', 'DD TMMON YYYY' COLLATE "tr_TR%icu");
+   to_char   
+-------------
+ 01 NİS 2010
+(1 row)
+
+-- backwards parsing
+CREATE VIEW collview1 AS SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc';
+CREATE VIEW collview2 AS SELECT a, b FROM collate_test1 ORDER BY b COLLATE "C";
+CREATE VIEW collview3 AS SELECT a, lower((x || x) COLLATE "C") FROM collate_test10;
+SELECT table_name, view_definition FROM information_schema.views
+  WHERE table_name LIKE 'collview%' ORDER BY 1;
+ table_name |                             view_definition                              
+------------+--------------------------------------------------------------------------
+ collview1  |  SELECT collate_test1.a,                                                +
+            |     collate_test1.b                                                     +
+            |    FROM collate_test1                                                   +
+            |   WHERE ((collate_test1.b COLLATE "C") >= 'bbc'::text);
+ collview2  |  SELECT collate_test1.a,                                                +
+            |     collate_test1.b                                                     +
+            |    FROM collate_test1                                                   +
+            |   ORDER BY (collate_test1.b COLLATE "C");
+ collview3  |  SELECT collate_test10.a,                                               +
+            |     lower(((collate_test10.x || collate_test10.x) COLLATE "C")) AS lower+
+            |    FROM collate_test10;
+(3 rows)
+
+-- collation propagation in various expression types
+SELECT a, coalesce(b, 'foo') FROM collate_test1 ORDER BY 2;
+ a | coalesce 
+---+----------
+ 1 | abc
+ 4 | ABC
+ 2 | äbc
+ 3 | bbc
+(4 rows)
+
+SELECT a, coalesce(b, 'foo') FROM collate_test2 ORDER BY 2;
+ a | coalesce 
+---+----------
+ 1 | abc
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, coalesce(b, 'foo') FROM collate_test3 ORDER BY 2;
+ a | coalesce 
+---+----------
+ 4 | ABC
+ 1 | abc
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, lower(coalesce(x, 'foo')), lower(coalesce(y, 'foo')) FROM collate_test10;
+ a | lower | lower 
+---+-------+-------
+ 1 | hij   | hij
+ 2 | hij   | hıj
+(2 rows)
+
+SELECT a, b, greatest(b, 'CCC') FROM collate_test1 ORDER BY 3;
+ a |  b  | greatest 
+---+-----+----------
+ 1 | abc | CCC
+ 2 | äbc | CCC
+ 3 | bbc | CCC
+ 4 | ABC | CCC
+(4 rows)
+
+SELECT a, b, greatest(b, 'CCC') FROM collate_test2 ORDER BY 3;
+ a |  b  | greatest 
+---+-----+----------
+ 1 | abc | CCC
+ 3 | bbc | CCC
+ 4 | ABC | CCC
+ 2 | äbc | äbc
+(4 rows)
+
+SELECT a, b, greatest(b, 'CCC') FROM collate_test3 ORDER BY 3;
+ a |  b  | greatest 
+---+-----+----------
+ 4 | ABC | CCC
+ 1 | abc | abc
+ 3 | bbc | bbc
+ 2 | äbc | äbc
+(4 rows)
+
+SELECT a, x, y, lower(greatest(x, 'foo')), lower(greatest(y, 'foo')) FROM collate_test10;
+ a |  x  |  y  | lower | lower 
+---+-----+-----+-------+-------
+ 1 | hij | hij | hij   | hij
+ 2 | HIJ | HIJ | hij   | hıj
+(2 rows)
+
+SELECT a, nullif(b, 'abc') FROM collate_test1 ORDER BY 2;
+ a | nullif 
+---+--------
+ 4 | ABC
+ 2 | äbc
+ 3 | bbc
+ 1 | 
+(4 rows)
+
+SELECT a, nullif(b, 'abc') FROM collate_test2 ORDER BY 2;
+ a | nullif 
+---+--------
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+ 1 | 
+(4 rows)
+
+SELECT a, nullif(b, 'abc') FROM collate_test3 ORDER BY 2;
+ a | nullif 
+---+--------
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+ 1 | 
+(4 rows)
+
+SELECT a, lower(nullif(x, 'foo')), lower(nullif(y, 'foo')) FROM collate_test10;
+ a | lower | lower 
+---+-------+-------
+ 1 | hij   | hij
+ 2 | hij   | hıj
+(2 rows)
+
+SELECT a, CASE b WHEN 'abc' THEN 'abcd' ELSE b END FROM collate_test1 ORDER BY 2;
+ a |  b   
+---+------
+ 4 | ABC
+ 2 | äbc
+ 1 | abcd
+ 3 | bbc
+(4 rows)
+
+SELECT a, CASE b WHEN 'abc' THEN 'abcd' ELSE b END FROM collate_test2 ORDER BY 2;
+ a |  b   
+---+------
+ 4 | ABC
+ 1 | abcd
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, CASE b WHEN 'abc' THEN 'abcd' ELSE b END FROM collate_test3 ORDER BY 2;
+ a |  b   
+---+------
+ 4 | ABC
+ 1 | abcd
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+CREATE DOMAIN testdomain AS text;
+SELECT a, b::testdomain FROM collate_test1 ORDER BY 2;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 2 | äbc
+ 3 | bbc
+(4 rows)
+
+SELECT a, b::testdomain FROM collate_test2 ORDER BY 2;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, b::testdomain FROM collate_test3 ORDER BY 2;
+ a |  b  
+---+-----
+ 4 | ABC
+ 1 | abc
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, b::testdomain_sv FROM collate_test3 ORDER BY 2;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, lower(x::testdomain), lower(y::testdomain) FROM collate_test10;
+ a | lower | lower 
+---+-------+-------
+ 1 | hij   | hij
+ 2 | hij   | hıj
+(2 rows)
+
+SELECT min(b), max(b) FROM collate_test1;
+ min | max 
+-----+-----
+ abc | bbc
+(1 row)
+
+SELECT min(b), max(b) FROM collate_test2;
+ min | max 
+-----+-----
+ abc | äbc
+(1 row)
+
+SELECT min(b), max(b) FROM collate_test3;
+ min | max 
+-----+-----
+ ABC | äbc
+(1 row)
+
+SELECT array_agg(b ORDER BY b) FROM collate_test1;
+     array_agg     
+-------------------
+ {abc,ABC,äbc,bbc}
+(1 row)
+
+SELECT array_agg(b ORDER BY b) FROM collate_test2;
+     array_agg     
+-------------------
+ {abc,ABC,bbc,äbc}
+(1 row)
+
+SELECT array_agg(b ORDER BY b) FROM collate_test3;
+     array_agg     
+-------------------
+ {ABC,abc,bbc,äbc}
+(1 row)
+
+SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM collate_test1 ORDER BY 2;
+ a |  b  
+---+-----
+ 1 | abc
+ 1 | abc
+ 4 | ABC
+ 4 | ABC
+ 2 | äbc
+ 2 | äbc
+ 3 | bbc
+ 3 | bbc
+(8 rows)
+
+SELECT a, b FROM collate_test2 UNION SELECT a, b FROM collate_test2 ORDER BY 2;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, b FROM collate_test3 WHERE a < 4 INTERSECT SELECT a, b FROM collate_test3 WHERE a > 1 ORDER BY 2;
+ a |  b  
+---+-----
+ 3 | bbc
+ 2 | äbc
+(2 rows)
+
+SELECT a, b FROM collate_test3 EXCEPT SELECT a, b FROM collate_test3 WHERE a < 2 ORDER BY 2;
+ a |  b  
+---+-----
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(3 rows)
+
+SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
+ERROR:  could not determine which collation to use for string comparison
+HINT:  Use the COLLATE clause to set the collation explicitly.
+SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM collate_test3; -- ok
+ a |  b  
+---+-----
+ 1 | abc
+ 2 | äbc
+ 3 | bbc
+ 4 | ABC
+ 1 | abc
+ 2 | äbc
+ 3 | bbc
+ 4 | ABC
+(8 rows)
+
+SELECT a, b FROM collate_test1 UNION SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
+ERROR:  collation mismatch between implicit collations "en_US%icu" and "C"
+LINE 1: SELECT a, b FROM collate_test1 UNION SELECT a, b FROM collat...
+                                                       ^
+HINT:  You can choose the collation by applying the COLLATE clause to one or both expressions.
+SELECT a, b COLLATE "C" FROM collate_test1 UNION SELECT a, b FROM collate_test3 ORDER BY 2; -- ok
+ a |  b  
+---+-----
+ 4 | ABC
+ 1 | abc
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, b FROM collate_test1 INTERSECT SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
+ERROR:  collation mismatch between implicit collations "en_US%icu" and "C"
+LINE 1: ...ELECT a, b FROM collate_test1 INTERSECT SELECT a, b FROM col...
+                                                             ^
+HINT:  You can choose the collation by applying the COLLATE clause to one or both expressions.
+SELECT a, b FROM collate_test1 EXCEPT SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
+ERROR:  collation mismatch between implicit collations "en_US%icu" and "C"
+LINE 1: SELECT a, b FROM collate_test1 EXCEPT SELECT a, b FROM colla...
+                                                        ^
+HINT:  You can choose the collation by applying the COLLATE clause to one or both expressions.
+CREATE TABLE test_u AS SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM collate_test3; -- fail
+ERROR:  no collation was derived for column "b" with collatable type text
+HINT:  Use the COLLATE clause to set the collation explicitly.
+-- ideally this would be a parse-time error, but for now it must be run-time:
+select x < y from collate_test10; -- fail
+ERROR:  could not determine which collation to use for string comparison
+HINT:  Use the COLLATE clause to set the collation explicitly.
+select x || y from collate_test10; -- ok, because || is not collation aware
+ ?column? 
+----------
+ hijhij
+ HIJHIJ
+(2 rows)
+
+select x, y from collate_test10 order by x || y; -- not so ok
+ERROR:  collation mismatch between implicit collations "en_US%icu" and "tr_TR%icu"
+LINE 1: select x, y from collate_test10 order by x || y;
+                                                      ^
+HINT:  You can choose the collation by applying the COLLATE clause to one or both expressions.
+-- collation mismatch between recursive and non-recursive term
+WITH RECURSIVE foo(x) AS
+   (SELECT x FROM (VALUES('a' COLLATE "en_US%icu"),('b')) t(x)
+   UNION ALL
+   SELECT (x || 'c') COLLATE "de_DE%icu" FROM foo WHERE length(x) < 10)
+SELECT * FROM foo;
+ERROR:  recursive query "foo" column 1 has collation "en_US%icu" in non-recursive term but collation "de_DE%icu" overall
+LINE 2:    (SELECT x FROM (VALUES('a' COLLATE "en_US%icu"),('b')) t(...
+                   ^
+HINT:  Use the COLLATE clause to set the collation of the non-recursive term.
+-- casting
+SELECT CAST('42' AS text COLLATE "C");
+ERROR:  syntax error at or near "COLLATE"
+LINE 1: SELECT CAST('42' AS text COLLATE "C");
+                                 ^
+SELECT a, CAST(b AS varchar) FROM collate_test1 ORDER BY 2;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 2 | äbc
+ 3 | bbc
+(4 rows)
+
+SELECT a, CAST(b AS varchar) FROM collate_test2 ORDER BY 2;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, CAST(b AS varchar) FROM collate_test3 ORDER BY 2;
+ a |  b  
+---+-----
+ 4 | ABC
+ 1 | abc
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+-- propagation of collation in SQL functions (inlined and non-inlined cases)
+-- and plpgsql functions too
+CREATE FUNCTION mylt (text, text) RETURNS boolean LANGUAGE sql
+    AS $$ select $1 < $2 $$;
+CREATE FUNCTION mylt_noninline (text, text) RETURNS boolean LANGUAGE sql
+    AS $$ select $1 < $2 limit 1 $$;
+CREATE FUNCTION mylt_plpgsql (text, text) RETURNS boolean LANGUAGE plpgsql
+    AS $$ begin return $1 < $2; end $$;
+SELECT a.b AS a, b.b AS b, a.b < b.b AS lt,
+       mylt(a.b, b.b), mylt_noninline(a.b, b.b), mylt_plpgsql(a.b, b.b)
+FROM collate_test1 a, collate_test1 b
+ORDER BY a.b, b.b;
+  a  |  b  | lt | mylt | mylt_noninline | mylt_plpgsql 
+-----+-----+----+------+----------------+--------------
+ abc | abc | f  | f    | f              | f
+ abc | ABC | t  | t    | t              | t
+ abc | äbc | t  | t    | t              | t
+ abc | bbc | t  | t    | t              | t
+ ABC | abc | f  | f    | f              | f
+ ABC | ABC | f  | f    | f              | f
+ ABC | äbc | t  | t    | t              | t
+ ABC | bbc | t  | t    | t              | t
+ äbc | abc | f  | f    | f              | f
+ äbc | ABC | f  | f    | f              | f
+ äbc | äbc | f  | f    | f              | f
+ äbc | bbc | t  | t    | t              | t
+ bbc | abc | f  | f    | f              | f
+ bbc | ABC | f  | f    | f              | f
+ bbc | äbc | f  | f    | f              | f
+ bbc | bbc | f  | f    | f              | f
+(16 rows)
+
+SELECT a.b AS a, b.b AS b, a.b < b.b COLLATE "C" AS lt,
+       mylt(a.b, b.b COLLATE "C"), mylt_noninline(a.b, b.b COLLATE "C"),
+       mylt_plpgsql(a.b, b.b COLLATE "C")
+FROM collate_test1 a, collate_test1 b
+ORDER BY a.b, b.b;
+  a  |  b  | lt | mylt | mylt_noninline | mylt_plpgsql 
+-----+-----+----+------+----------------+--------------
+ abc | abc | f  | f    | f              | f
+ abc | ABC | f  | f    | f              | f
+ abc | äbc | t  | t    | t              | t
+ abc | bbc | t  | t    | t              | t
+ ABC | abc | t  | t    | t              | t
+ ABC | ABC | f  | f    | f              | f
+ ABC | äbc | t  | t    | t              | t
+ ABC | bbc | t  | t    | t              | t
+ äbc | abc | f  | f    | f              | f
+ äbc | ABC | f  | f    | f              | f
+ äbc | äbc | f  | f    | f              | f
+ äbc | bbc | f  | f    | f              | f
+ bbc | abc | f  | f    | f              | f
+ bbc | ABC | f  | f    | f              | f
+ bbc | äbc | t  | t    | t              | t
+ bbc | bbc | f  | f    | f              | f
+(16 rows)
+
+-- collation override in plpgsql
+CREATE FUNCTION mylt2 (x text, y text) RETURNS boolean LANGUAGE plpgsql AS $$
+declare
+  xx text := x;
+  yy text := y;
+begin
+  return xx < yy;
+end
+$$;
+SELECT mylt2('a', 'B' collate "en_US") as t, mylt2('a', 'B' collate "C") as f;
+ t | f 
+---+---
+ t | f
+(1 row)
+
+CREATE OR REPLACE FUNCTION
+  mylt2 (x text, y text) RETURNS boolean LANGUAGE plpgsql AS $$
+declare
+  xx text COLLATE "POSIX" := x;
+  yy text := y;
+begin
+  return xx < yy;
+end
+$$;
+SELECT mylt2('a', 'B') as f;
+ f 
+---
+ f
+(1 row)
+
+SELECT mylt2('a', 'B' collate "C") as fail; -- conflicting collations
+ERROR:  could not determine which collation to use for string comparison
+HINT:  Use the COLLATE clause to set the collation explicitly.
+CONTEXT:  PL/pgSQL function mylt2(text,text) line 6 at RETURN
+SELECT mylt2('a', 'B' collate "POSIX") as f;
+ f 
+---
+ f
+(1 row)
+
+-- polymorphism
+SELECT * FROM unnest((SELECT array_agg(b ORDER BY b) FROM collate_test1)) ORDER BY 1;
+ unnest 
+--------
+ abc
+ ABC
+ äbc
+ bbc
+(4 rows)
+
+SELECT * FROM unnest((SELECT array_agg(b ORDER BY b) FROM collate_test2)) ORDER BY 1;
+ unnest 
+--------
+ abc
+ ABC
+ bbc
+ äbc
+(4 rows)
+
+SELECT * FROM unnest((SELECT array_agg(b ORDER BY b) FROM collate_test3)) ORDER BY 1;
+ unnest 
+--------
+ ABC
+ abc
+ bbc
+ äbc
+(4 rows)
+
+CREATE FUNCTION dup (anyelement) RETURNS anyelement
+    AS 'select $1' LANGUAGE sql;
+SELECT a, dup(b) FROM collate_test1 ORDER BY 2;
+ a | dup 
+---+-----
+ 1 | abc
+ 4 | ABC
+ 2 | äbc
+ 3 | bbc
+(4 rows)
+
+SELECT a, dup(b) FROM collate_test2 ORDER BY 2;
+ a | dup 
+---+-----
+ 1 | abc
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, dup(b) FROM collate_test3 ORDER BY 2;
+ a | dup 
+---+-----
+ 4 | ABC
+ 1 | abc
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+-- indexes
+CREATE INDEX collate_test1_idx1 ON collate_test1 (b);
+CREATE INDEX collate_test1_idx2 ON collate_test1 (b COLLATE "C");
+CREATE INDEX collate_test1_idx3 ON collate_test1 ((b COLLATE "C")); -- this is different grammatically
+CREATE INDEX collate_test1_idx4 ON collate_test1 (((b||'foo') COLLATE "POSIX"));
+CREATE INDEX collate_test1_idx5 ON collate_test1 (a COLLATE "C"); -- fail
+ERROR:  collations are not supported by type integer
+CREATE INDEX collate_test1_idx6 ON collate_test1 ((a COLLATE "C")); -- fail
+ERROR:  collations are not supported by type integer
+LINE 1: ...ATE INDEX collate_test1_idx6 ON collate_test1 ((a COLLATE "C...
+                                                             ^
+SELECT relname, pg_get_indexdef(oid) FROM pg_class WHERE relname LIKE 'collate_test%_idx%' ORDER BY 1;
+      relname       |                                           pg_get_indexdef                                           
+--------------------+-----------------------------------------------------------------------------------------------------
+ collate_test1_idx1 | CREATE INDEX collate_test1_idx1 ON collate_test1 USING btree (b)
+ collate_test1_idx2 | CREATE INDEX collate_test1_idx2 ON collate_test1 USING btree (b COLLATE "C")
+ collate_test1_idx3 | CREATE INDEX collate_test1_idx3 ON collate_test1 USING btree (b COLLATE "C")
+ collate_test1_idx4 | CREATE INDEX collate_test1_idx4 ON collate_test1 USING btree (((b || 'foo'::text)) COLLATE "POSIX")
+(4 rows)
+
+-- schema manipulation commands
+CREATE ROLE regress_test_role;
+CREATE SCHEMA test_schema;
+-- We need to do this this way to cope with varying names for encodings:
+do $$
+BEGIN
+  EXECUTE 'CREATE COLLATION test0 (locale = ' ||
+          quote_literal(current_setting('lc_collate')) || ');';
+END
+$$;
+CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
+ERROR:  collation "test0" for encoding "UTF8" already exists
+do $$
+BEGIN
+  EXECUTE 'CREATE COLLATION test1 (lc_collate = ' ||
+          quote_literal(current_setting('lc_collate')) ||
+          ', lc_ctype = ' ||
+          quote_literal(current_setting('lc_ctype')) || ');';
+END
+$$;
+CREATE COLLATION test3 (lc_collate = 'en_US.utf8'); -- fail, need lc_ctype
+ERROR:  parameter "lc_ctype" must be specified
+CREATE COLLATION testx (locale = 'nonsense'); -- fail
+ERROR:  could not create locale "nonsense": No such file or directory
+DETAIL:  The operating system could not find any locale data for the locale name "nonsense".
+CREATE COLLATION test4 FROM nonsense;
+ERROR:  collation "nonsense" for encoding "UTF8" does not exist
+CREATE COLLATION test5 FROM test0;
+SELECT collname FROM pg_collation WHERE collname LIKE 'test%' ORDER BY 1;
+ collname 
+----------
+ test0
+ test1
+ test5
+(3 rows)
+
+ALTER COLLATION test1 RENAME TO test11;
+ALTER COLLATION test0 RENAME TO test11; -- fail
+ERROR:  collation "test11" for encoding "UTF8" already exists in schema "public"
+ALTER COLLATION test1 RENAME TO test22; -- fail
+ERROR:  collation "test1" for encoding "UTF8" does not exist
+ALTER COLLATION test11 OWNER TO regress_test_role;
+ALTER COLLATION test11 OWNER TO nonsense;
+ERROR:  role "nonsense" does not exist
+ALTER COLLATION test11 SET SCHEMA test_schema;
+COMMENT ON COLLATION test0 IS 'US English';
+SELECT collname, nspname, obj_description(pg_collation.oid, 'pg_collation')
+    FROM pg_collation JOIN pg_namespace ON (collnamespace = pg_namespace.oid)
+    WHERE collname LIKE 'test%'
+    ORDER BY 1;
+ collname |   nspname   | obj_description 
+----------+-------------+-----------------
+ test0    | public      | US English
+ test11   | test_schema | 
+ test5    | public      | 
+(3 rows)
+
+DROP COLLATION test0, test_schema.test11, test5;
+DROP COLLATION test0; -- fail
+ERROR:  collation "test0" for encoding "UTF8" does not exist
+DROP COLLATION IF EXISTS test0;
+NOTICE:  collation "test0" does not exist, skipping
+SELECT collname FROM pg_collation WHERE collname LIKE 'test%';
+ collname 
+----------
+(0 rows)
+
+DROP SCHEMA test_schema;
+DROP ROLE regress_test_role;
+-- dependencies
+CREATE COLLATION test0 FROM "C";
+CREATE TABLE collate_dep_test1 (a int, b text COLLATE test0);
+CREATE DOMAIN collate_dep_dom1 AS text COLLATE test0;
+CREATE TYPE collate_dep_test2 AS (x int, y text COLLATE test0);
+CREATE VIEW collate_dep_test3 AS SELECT text 'foo' COLLATE test0 AS foo;
+CREATE TABLE collate_dep_test4t (a int, b text);
+CREATE INDEX collate_dep_test4i ON collate_dep_test4t (b COLLATE test0);
+DROP COLLATION test0 RESTRICT; -- fail
+ERROR:  cannot drop collation test0 because other objects depend on it
+DETAIL:  table collate_dep_test1 column b depends on collation test0
+type collate_dep_dom1 depends on collation test0
+composite type collate_dep_test2 column y depends on collation test0
+view collate_dep_test3 depends on collation test0
+index collate_dep_test4i depends on collation test0
+HINT:  Use DROP ... CASCADE to drop the dependent objects too.
+DROP COLLATION test0 CASCADE;
+NOTICE:  drop cascades to 5 other objects
+DETAIL:  drop cascades to table collate_dep_test1 column b
+drop cascades to type collate_dep_dom1
+drop cascades to composite type collate_dep_test2 column y
+drop cascades to view collate_dep_test3
+drop cascades to index collate_dep_test4i
+\d collate_dep_test1
+Table "public.collate_dep_test1"
+ Column |  Type   | Modifiers 
+--------+---------+-----------
+ a      | integer | 
+
+\d collate_dep_test2
+Composite type "public.collate_dep_test2"
+ Column |  Type   | Modifiers 
+--------+---------+-----------
+ x      | integer | 
+
+DROP TABLE collate_dep_test1, collate_dep_test4t;
+DROP TYPE collate_dep_test2;
+-- test range types and collations
+create type textrange_c as range(subtype=text, collation="C");
+create type textrange_en_us as range(subtype=text, collation="en_US%icu");
+select textrange_c('A','Z') @> 'b'::text;
+ ?column? 
+----------
+ f
+(1 row)
+
+select textrange_en_us('A','Z') @> 'b'::text;
+ ?column? 
+----------
+ t
+(1 row)
+
+drop type textrange_c;
+drop type textrange_en_us;
diff --git a/src/test/regress/sql/collate.icu.sql b/src/test/regress/sql/collate.icu.sql
new file mode 100644
index 0000000..874d2f8
--- /dev/null
+++ b/src/test/regress/sql/collate.icu.sql
@@ -0,0 +1,398 @@
+/*
+ * This test is for ICU collation support and assumes that the needed OS
+ * and ICU locales are installed.  It must be run in a database with UTF-8
+ * encoding, because other encodings don't support all the characters used.
+ */
+
+SET client_encoding TO UTF8;
+
+
+CREATE TABLE collate_test1 (
+    a int,
+    b text COLLATE "en_US%icu" NOT NULL
+);
+
+\d collate_test1
+
+CREATE TABLE collate_test_fail (
+    a int,
+    b text COLLATE "ja_JP.eucjp%icu"
+);
+
+CREATE TABLE collate_test_fail (
+    a int,
+    b text COLLATE "foo%icu"
+);
+
+CREATE TABLE collate_test_fail (
+    a int COLLATE "en_US%icu",
+    b text
+);
+
+CREATE TABLE collate_test_like (
+    LIKE collate_test1
+);
+
+\d collate_test_like
+
+CREATE TABLE collate_test2 (
+    a int,
+    b text COLLATE "sv_SE%icu"
+);
+
+CREATE TABLE collate_test3 (
+    a int,
+    b text COLLATE "C"
+);
+
+INSERT INTO collate_test1 VALUES (1, 'abc'), (2, 'äbc'), (3, 'bbc'), (4, 'ABC');
+INSERT INTO collate_test2 SELECT * FROM collate_test1;
+INSERT INTO collate_test3 SELECT * FROM collate_test1;
+
+SELECT * FROM collate_test1 WHERE b >= 'bbc';
+SELECT * FROM collate_test2 WHERE b >= 'bbc';
+SELECT * FROM collate_test3 WHERE b >= 'bbc';
+SELECT * FROM collate_test3 WHERE b >= 'BBC';
+
+SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc';
+SELECT * FROM collate_test1 WHERE b >= 'bbc' COLLATE "C";
+SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "C";
+SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "en_US%icu";
+
+
+CREATE DOMAIN testdomain_sv AS text COLLATE "sv_SE%icu";
+CREATE DOMAIN testdomain_i AS int COLLATE "sv_SE%icu"; -- fails
+CREATE TABLE collate_test4 (
+    a int,
+    b testdomain_sv
+);
+INSERT INTO collate_test4 SELECT * FROM collate_test1;
+SELECT a, b FROM collate_test4 ORDER BY b;
+
+CREATE TABLE collate_test5 (
+    a int,
+    b testdomain_sv COLLATE "en_US%icu"
+);
+INSERT INTO collate_test5 SELECT * FROM collate_test1;
+SELECT a, b FROM collate_test5 ORDER BY b;
+
+
+SELECT a, b FROM collate_test1 ORDER BY b;
+SELECT a, b FROM collate_test2 ORDER BY b;
+SELECT a, b FROM collate_test3 ORDER BY b;
+
+SELECT a, b FROM collate_test1 ORDER BY b COLLATE "C";
+
+-- star expansion
+SELECT * FROM collate_test1 ORDER BY b;
+SELECT * FROM collate_test2 ORDER BY b;
+SELECT * FROM collate_test3 ORDER BY b;
+
+-- constant expression folding
+SELECT 'bbc' COLLATE "en_US%icu" > 'äbc' COLLATE "en_US%icu" AS "true";
+SELECT 'bbc' COLLATE "sv_SE%icu" > 'äbc' COLLATE "sv_SE%icu" AS "false";
+
+-- upper/lower
+
+CREATE TABLE collate_test10 (
+    a int,
+    x text COLLATE "en_US%icu",
+    y text COLLATE "tr_TR%icu"
+);
+
+INSERT INTO collate_test10 VALUES (1, 'hij', 'hij'), (2, 'HIJ', 'HIJ');
+
+SELECT a, lower(x), lower(y), upper(x), upper(y), initcap(x), initcap(y) FROM collate_test10;
+SELECT a, lower(x COLLATE "C"), lower(y COLLATE "C") FROM collate_test10;
+
+SELECT a, x, y FROM collate_test10 ORDER BY lower(y), a;
+
+-- LIKE/ILIKE
+
+SELECT * FROM collate_test1 WHERE b LIKE 'abc';
+SELECT * FROM collate_test1 WHERE b LIKE 'abc%';
+SELECT * FROM collate_test1 WHERE b LIKE '%bc%';
+SELECT * FROM collate_test1 WHERE b ILIKE 'abc';
+SELECT * FROM collate_test1 WHERE b ILIKE 'abc%';
+SELECT * FROM collate_test1 WHERE b ILIKE '%bc%';
+
+SELECT 'Türkiye' COLLATE "en_US%icu" ILIKE '%KI%' AS "true";
+SELECT 'Türkiye' COLLATE "tr_TR%icu" ILIKE '%KI%' AS "false";
+
+SELECT 'bıt' ILIKE 'BIT' COLLATE "en_US%icu" AS "false";
+SELECT 'bıt' ILIKE 'BIT' COLLATE "tr_TR%icu" AS "true";
+
+-- The following actually exercises the selectivity estimation for ILIKE.
+SELECT relname FROM pg_class WHERE relname ILIKE 'abc%';
+
+-- regular expressions
+
+SELECT * FROM collate_test1 WHERE b ~ '^abc$';
+SELECT * FROM collate_test1 WHERE b ~ '^abc';
+SELECT * FROM collate_test1 WHERE b ~ 'bc';
+SELECT * FROM collate_test1 WHERE b ~* '^abc$';
+SELECT * FROM collate_test1 WHERE b ~* '^abc';
+SELECT * FROM collate_test1 WHERE b ~* 'bc';
+
+SELECT 'Türkiye' COLLATE "en_US%icu" ~* 'KI' AS "true";
+SELECT 'Türkiye' COLLATE "tr_TR%icu" ~* 'KI' AS "false";
+
+SELECT 'bıt' ~* 'BIT' COLLATE "en_US%icu" AS "false";
+SELECT 'bıt' ~* 'BIT' COLLATE "tr_TR%icu" AS "true";
+
+-- The following actually exercises the selectivity estimation for ~*.
+SELECT relname FROM pg_class WHERE relname ~* '^abc';
+
+
+-- to_char
+
+SET lc_time TO 'tr_TR';
+SELECT to_char(date '2010-04-01', 'DD TMMON YYYY');
+SELECT to_char(date '2010-04-01', 'DD TMMON YYYY' COLLATE "tr_TR%icu");
+
+
+-- backwards parsing
+
+CREATE VIEW collview1 AS SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc';
+CREATE VIEW collview2 AS SELECT a, b FROM collate_test1 ORDER BY b COLLATE "C";
+CREATE VIEW collview3 AS SELECT a, lower((x || x) COLLATE "C") FROM collate_test10;
+
+SELECT table_name, view_definition FROM information_schema.views
+  WHERE table_name LIKE 'collview%' ORDER BY 1;
+
+
+-- collation propagation in various expression types
+
+SELECT a, coalesce(b, 'foo') FROM collate_test1 ORDER BY 2;
+SELECT a, coalesce(b, 'foo') FROM collate_test2 ORDER BY 2;
+SELECT a, coalesce(b, 'foo') FROM collate_test3 ORDER BY 2;
+SELECT a, lower(coalesce(x, 'foo')), lower(coalesce(y, 'foo')) FROM collate_test10;
+
+SELECT a, b, greatest(b, 'CCC') FROM collate_test1 ORDER BY 3;
+SELECT a, b, greatest(b, 'CCC') FROM collate_test2 ORDER BY 3;
+SELECT a, b, greatest(b, 'CCC') FROM collate_test3 ORDER BY 3;
+SELECT a, x, y, lower(greatest(x, 'foo')), lower(greatest(y, 'foo')) FROM collate_test10;
+
+SELECT a, nullif(b, 'abc') FROM collate_test1 ORDER BY 2;
+SELECT a, nullif(b, 'abc') FROM collate_test2 ORDER BY 2;
+SELECT a, nullif(b, 'abc') FROM collate_test3 ORDER BY 2;
+SELECT a, lower(nullif(x, 'foo')), lower(nullif(y, 'foo')) FROM collate_test10;
+
+SELECT a, CASE b WHEN 'abc' THEN 'abcd' ELSE b END FROM collate_test1 ORDER BY 2;
+SELECT a, CASE b WHEN 'abc' THEN 'abcd' ELSE b END FROM collate_test2 ORDER BY 2;
+SELECT a, CASE b WHEN 'abc' THEN 'abcd' ELSE b END FROM collate_test3 ORDER BY 2;
+
+CREATE DOMAIN testdomain AS text;
+SELECT a, b::testdomain FROM collate_test1 ORDER BY 2;
+SELECT a, b::testdomain FROM collate_test2 ORDER BY 2;
+SELECT a, b::testdomain FROM collate_test3 ORDER BY 2;
+SELECT a, b::testdomain_sv FROM collate_test3 ORDER BY 2;
+SELECT a, lower(x::testdomain), lower(y::testdomain) FROM collate_test10;
+
+SELECT min(b), max(b) FROM collate_test1;
+SELECT min(b), max(b) FROM collate_test2;
+SELECT min(b), max(b) FROM collate_test3;
+
+SELECT array_agg(b ORDER BY b) FROM collate_test1;
+SELECT array_agg(b ORDER BY b) FROM collate_test2;
+SELECT array_agg(b ORDER BY b) FROM collate_test3;
+
+SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM collate_test1 ORDER BY 2;
+SELECT a, b FROM collate_test2 UNION SELECT a, b FROM collate_test2 ORDER BY 2;
+SELECT a, b FROM collate_test3 WHERE a < 4 INTERSECT SELECT a, b FROM collate_test3 WHERE a > 1 ORDER BY 2;
+SELECT a, b FROM collate_test3 EXCEPT SELECT a, b FROM collate_test3 WHERE a < 2 ORDER BY 2;
+
+SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
+SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM collate_test3; -- ok
+SELECT a, b FROM collate_test1 UNION SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
+SELECT a, b COLLATE "C" FROM collate_test1 UNION SELECT a, b FROM collate_test3 ORDER BY 2; -- ok
+SELECT a, b FROM collate_test1 INTERSECT SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
+SELECT a, b FROM collate_test1 EXCEPT SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
+
+CREATE TABLE test_u AS SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM collate_test3; -- fail
+
+-- ideally this would be a parse-time error, but for now it must be run-time:
+select x < y from collate_test10; -- fail
+select x || y from collate_test10; -- ok, because || is not collation aware
+select x, y from collate_test10 order by x || y; -- not so ok
+
+-- collation mismatch between recursive and non-recursive term
+WITH RECURSIVE foo(x) AS
+   (SELECT x FROM (VALUES('a' COLLATE "en_US%icu"),('b')) t(x)
+   UNION ALL
+   SELECT (x || 'c') COLLATE "de_DE%icu" FROM foo WHERE length(x) < 10)
+SELECT * FROM foo;
+
+
+-- casting
+
+SELECT CAST('42' AS text COLLATE "C");
+
+SELECT a, CAST(b AS varchar) FROM collate_test1 ORDER BY 2;
+SELECT a, CAST(b AS varchar) FROM collate_test2 ORDER BY 2;
+SELECT a, CAST(b AS varchar) FROM collate_test3 ORDER BY 2;
+
+
+-- propagation of collation in SQL functions (inlined and non-inlined cases)
+-- and plpgsql functions too
+
+CREATE FUNCTION mylt (text, text) RETURNS boolean LANGUAGE sql
+    AS $$ select $1 < $2 $$;
+
+CREATE FUNCTION mylt_noninline (text, text) RETURNS boolean LANGUAGE sql
+    AS $$ select $1 < $2 limit 1 $$;
+
+CREATE FUNCTION mylt_plpgsql (text, text) RETURNS boolean LANGUAGE plpgsql
+    AS $$ begin return $1 < $2; end $$;
+
+SELECT a.b AS a, b.b AS b, a.b < b.b AS lt,
+       mylt(a.b, b.b), mylt_noninline(a.b, b.b), mylt_plpgsql(a.b, b.b)
+FROM collate_test1 a, collate_test1 b
+ORDER BY a.b, b.b;
+
+SELECT a.b AS a, b.b AS b, a.b < b.b COLLATE "C" AS lt,
+       mylt(a.b, b.b COLLATE "C"), mylt_noninline(a.b, b.b COLLATE "C"),
+       mylt_plpgsql(a.b, b.b COLLATE "C")
+FROM collate_test1 a, collate_test1 b
+ORDER BY a.b, b.b;
+
+
+-- collation override in plpgsql
+
+CREATE FUNCTION mylt2 (x text, y text) RETURNS boolean LANGUAGE plpgsql AS $$
+declare
+  xx text := x;
+  yy text := y;
+begin
+  return xx < yy;
+end
+$$;
+
+SELECT mylt2('a', 'B' collate "en_US") as t, mylt2('a', 'B' collate "C") as f;
+
+CREATE OR REPLACE FUNCTION
+  mylt2 (x text, y text) RETURNS boolean LANGUAGE plpgsql AS $$
+declare
+  xx text COLLATE "POSIX" := x;
+  yy text := y;
+begin
+  return xx < yy;
+end
+$$;
+
+SELECT mylt2('a', 'B') as f;
+SELECT mylt2('a', 'B' collate "C") as fail; -- conflicting collations
+SELECT mylt2('a', 'B' collate "POSIX") as f;
+
+
+-- polymorphism
+
+SELECT * FROM unnest((SELECT array_agg(b ORDER BY b) FROM collate_test1)) ORDER BY 1;
+SELECT * FROM unnest((SELECT array_agg(b ORDER BY b) FROM collate_test2)) ORDER BY 1;
+SELECT * FROM unnest((SELECT array_agg(b ORDER BY b) FROM collate_test3)) ORDER BY 1;
+
+CREATE FUNCTION dup (anyelement) RETURNS anyelement
+    AS 'select $1' LANGUAGE sql;
+
+SELECT a, dup(b) FROM collate_test1 ORDER BY 2;
+SELECT a, dup(b) FROM collate_test2 ORDER BY 2;
+SELECT a, dup(b) FROM collate_test3 ORDER BY 2;
+
+
+-- indexes
+
+CREATE INDEX collate_test1_idx1 ON collate_test1 (b);
+CREATE INDEX collate_test1_idx2 ON collate_test1 (b COLLATE "C");
+CREATE INDEX collate_test1_idx3 ON collate_test1 ((b COLLATE "C")); -- this is different grammatically
+CREATE INDEX collate_test1_idx4 ON collate_test1 (((b||'foo') COLLATE "POSIX"));
+
+CREATE INDEX collate_test1_idx5 ON collate_test1 (a COLLATE "C"); -- fail
+CREATE INDEX collate_test1_idx6 ON collate_test1 ((a COLLATE "C")); -- fail
+
+SELECT relname, pg_get_indexdef(oid) FROM pg_class WHERE relname LIKE 'collate_test%_idx%' ORDER BY 1;
+
+
+-- schema manipulation commands
+
+CREATE ROLE regress_test_role;
+CREATE SCHEMA test_schema;
+
+-- We need to do this this way to cope with varying names for encodings:
+do $$
+BEGIN
+  EXECUTE 'CREATE COLLATION test0 (locale = ' ||
+          quote_literal(current_setting('lc_collate')) || ');';
+END
+$$;
+CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
+do $$
+BEGIN
+  EXECUTE 'CREATE COLLATION test1 (lc_collate = ' ||
+          quote_literal(current_setting('lc_collate')) ||
+          ', lc_ctype = ' ||
+          quote_literal(current_setting('lc_ctype')) || ');';
+END
+$$;
+CREATE COLLATION test3 (lc_collate = 'en_US.utf8'); -- fail, need lc_ctype
+CREATE COLLATION testx (locale = 'nonsense'); -- fail
+
+CREATE COLLATION test4 FROM nonsense;
+CREATE COLLATION test5 FROM test0;
+
+SELECT collname FROM pg_collation WHERE collname LIKE 'test%' ORDER BY 1;
+
+ALTER COLLATION test1 RENAME TO test11;
+ALTER COLLATION test0 RENAME TO test11; -- fail
+ALTER COLLATION test1 RENAME TO test22; -- fail
+
+ALTER COLLATION test11 OWNER TO regress_test_role;
+ALTER COLLATION test11 OWNER TO nonsense;
+ALTER COLLATION test11 SET SCHEMA test_schema;
+
+COMMENT ON COLLATION test0 IS 'US English';
+
+SELECT collname, nspname, obj_description(pg_collation.oid, 'pg_collation')
+    FROM pg_collation JOIN pg_namespace ON (collnamespace = pg_namespace.oid)
+    WHERE collname LIKE 'test%'
+    ORDER BY 1;
+
+DROP COLLATION test0, test_schema.test11, test5;
+DROP COLLATION test0; -- fail
+DROP COLLATION IF EXISTS test0;
+
+SELECT collname FROM pg_collation WHERE collname LIKE 'test%';
+
+DROP SCHEMA test_schema;
+DROP ROLE regress_test_role;
+
+
+-- dependencies
+
+CREATE COLLATION test0 FROM "C";
+
+CREATE TABLE collate_dep_test1 (a int, b text COLLATE test0);
+CREATE DOMAIN collate_dep_dom1 AS text COLLATE test0;
+CREATE TYPE collate_dep_test2 AS (x int, y text COLLATE test0);
+CREATE VIEW collate_dep_test3 AS SELECT text 'foo' COLLATE test0 AS foo;
+CREATE TABLE collate_dep_test4t (a int, b text);
+CREATE INDEX collate_dep_test4i ON collate_dep_test4t (b COLLATE test0);
+
+DROP COLLATION test0 RESTRICT; -- fail
+DROP COLLATION test0 CASCADE;
+
+\d collate_dep_test1
+\d collate_dep_test2
+
+DROP TABLE collate_dep_test1, collate_dep_test4t;
+DROP TYPE collate_dep_test2;
+
+-- test range types and collations
+
+create type textrange_c as range(subtype=text, collation="C");
+create type textrange_en_us as range(subtype=text, collation="en_US%icu");
+
+select textrange_c('A','Z') @> 'b'::text;
+select textrange_en_us('A','Z') @> 'b'::text;
+
+drop type textrange_c;
+drop type textrange_en_us;
#2Craig Ringer
craig@2ndquadrant.com
In reply to: Peter Eisentraut (#1)
Re: ICU integration

On 31 August 2016 at 10:46, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

Here is a patch I've been working on to allow the use of ICU for sorting
and other locale things.

Great to see you working on this. We've got some icky head-in-the-sand
issues in this area when it comes to OS upgrades, cross-OS-release
replication, etc, and having more control over our locale handling
would be nice. It should also get rid of the Windows-vs-*nix locale
naming wart.

Speaking of which, have you had a chance to try it on Windows yet? If
not, I'm happy to offer some pointers and a working test env.

How stable are the ICU locales? Most importantly, does ICU offer any
way to "pin" a locale version, so we can say "we want de_DE as it was
in ICU 4.6" and get consistent behaviour when the user sets up a
replica on some other system with ICU 4.8? Even if the German
government has changed its mind (again) about some details of the
language and 4.8 knows about the changes but 4.4 doesn't?

Otherwise we'll just have a new version of the same problem when it
comes to replication and upgrades. User upgrades ICU or replicates to
host with newer ICU and the data in indexes no longer correctly
reflects the runtime.

Even if ICU doesn't help solve this problem it's still valuable. I
just think it's something to think about.

We could always bundle a specific ICU version, but I know how well
that'd go down with distributors. They'd just ignore us and unbundle
it then complain about it. Not wholly without reason either; they
don't want to push security updates and bug fixes for bundled
libraries. Also, ICU isn't exactly a pocket-sized library we can stash
in some 3rdpty_libs/ dir.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#3Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Craig Ringer (#2)
Re: ICU integration

On 8/30/16 11:27 PM, Craig Ringer wrote:

Speaking of which, have you had a chance to try it on Windows yet?

nope

How stable are the ICU locales? Most importantly, does ICU offer any
way to "pin" a locale version, so we can say "we want de_DE as it was
in ICU 4.6" and get consistent behaviour when the user sets up a
replica on some other system with ICU 4.8? Even if the German
government has changed its mind (again) about some details of the
language and 4.8 knows about the changes but 4.4 doesn't?

I forgot to mention this, but the patch adds a collversion column that
stores the collation version (provided by ICU). And then when you
upgrade ICU to something incompatible you get

+           if (numversion != collform->collversion)
+               ereport(WARNING,
+                       (errmsg("ICU collator version mismatch"),
+                        errdetail("The database was created using version 0x%08X, the library provides version 0x%08X.",
+                                  (uint32) collform->collversion, (uint32) numversion),
+                        errhint("Rebuild affected indexes, or build PostgreSQL with the right version of ICU.")));

So you still need to manage this carefully, but at least you have a
chance to learn about it.

Suggestions for refining this are welcome.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#4Michael Paquier
michael.paquier@gmail.com
In reply to: Peter Eisentraut (#3)
Re: ICU integration

On Wed, Aug 31, 2016 at 1:12 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 8/30/16 11:27 PM, Craig Ringer wrote:

Speaking of which, have you had a chance to try it on Windows yet?

nope

+SELECT a, b FROM collate_test2 ORDER BY b;
+ a |  b
+---+-----
+ 1 | abc
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(4 rows)
Be careful with non-ASCII characters in the regression tests; remember,
for example, where jacana was not happy:
https://www.postgresql.org/message-id/CAB7nPqROd2MXqy_5+cZJVhW0wHrrz6P8jV_RSbLcrXRTwLh7tQ@mail.gmail.com
-- 
Michael


#5Peter Geoghegan
pg@heroku.com
In reply to: Peter Eisentraut (#1)
Re: ICU integration

On Tue, Aug 30, 2016 at 7:46 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

Here is a patch I've been working on to allow the use of ICU for sorting
and other locale things.

I'm really happy that you're working on this. This is more important
than is widely appreciated, and very important overall.

In a world where ICU becomes the defacto standard (i.e. is used on
major platforms by default), what remaining barriers are there to
documenting and enforcing the binary compatibility of replicas? I may
be mistaken, but offhand I can't think of any. Being able to describe
exactly what works and what doesn't is very important. After all,
failure to adhere to "the rules" today, such as they are, can leave
you with a subtly broken replica. I'd like to make that scenario
mechanically impossible, by locking everything down.

I'm not sure how well it will work to replace all the bits of LIKE and
regular expressions with ICU API calls. One problem is that ICU likes
to do case folding as a whole string, not by character. I need to do
more research about that.

My guess is that there are cultural reasons why it wants to operate on
a whole string, at least in some cases.

Also note that ICU locales are encoding-independent and don't support a
separate collcollate and collctype, so the existing catalog structure is
not optimal.

That makes more sense to me, personally. ICU very explicitly decouples
technical issues (like the representation of strxfrm() keys, and, I
gather, encoding) from cultural issues (the actual user-visible
behaviors). This allows us to use strxfrm()-style binary keys in
indexes directly, since they're versioned independently from their
underlying collation; they can add a new optimization to strxfrm()-key
generation to the next ICU version, and we can detect that and require
a REINDEX, even when the collation version itself (the user-visible
behaviors) is unchanged. I'm getting ahead of myself here, but that
does seem very useful.
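
A minimal sketch of what that looks like on the ICU side (not from the
patch; locale name and buffer sizes are arbitrary, error handling omitted):
ICU's analogue of strxfrm() is ucol_getSortKey(), and the byte strings it
produces are ordered by plain byte comparison.

#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <unicode/ucol.h>
#include <unicode/ustring.h>

int main(void)
{
    UErrorCode status = U_ZERO_ERROR;
    UCollator *coll = ucol_open("en_US", &status);
    UChar s1[16], s2[16];
    uint8_t k1[64], k2[64];

    u_uastrcpy(s1, "abc");
    u_uastrcpy(s2, "ABC");

    /* binary sort keys; NUL-terminated, so strcmp() gives collation order */
    (void) ucol_getSortKey(coll, s1, -1, k1, sizeof(k1));
    (void) ucol_getSortKey(coll, s2, -1, k2, sizeof(k2));
    printf("key order: %d\n", strcmp((char *) k1, (char *) k2));

    ucol_close(coll);
    return 0;
}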

The Unicode collation algorithm [1] that ICU is directly based on
knows plenty about the requirements of indexing. It contains guidance
about equivalence vs. equality that we learned the hard way in commit
656beff5, for example.

Where it gets really interesting is what to do with the database
locales. They just set the global process locale. So in order to port
that to ICU we'd need to check every implicit use of the process locale
and tweak it. We could add a datcollprovider column or something. But
we also rely on the datctype setting to validate the encoding of the
database. Maybe we wouldn't need that anymore, but it sounds risky.

Not sure about that.

Whatever we come up with here needs to mesh well with the existing
conventions around collation versioning that ICU has, in the context
of various operating system packages in particular. We can arrange it
so that in practice, an ICU upgrade doesn't often break your indexes
due to a collation rule change; ICU is happy to have multiple versions
of a collation at a time, and you'll probably retain the old collation
version in ICU.

Even if your old collation version isn't available in a new ICU
release (which I think is unlikely in practice), or you downgrade ICU,
it might be possible to give guidance on how to download a "Collation
Resource Bundle" [2][3] that *does* have the right collation version,
which presumably satisfies the requirement immediately.

Firebird already uses ICU. Maybe we have something to learn from them
here. In particular, where do they (by which I mean the ICU version
that Firebird links to) get its collations from in practice? I think
that the CLDR Data collations were at one time not even distributed
with ICU source. It might be a matter of individual OS packagers of
ICU deciding what exact CLDR data to use, which may or may not be of
any significant consequence in practice.

[1]: http://unicode.org/reports/tr10
[2]: http://site.icu-project.org/design/size/collation
[3]: http://userguide.icu-project.org/icudata
--
Peter Geoghegan


#6Peter Geoghegan
pg@heroku.com
In reply to: Peter Eisentraut (#1)
Re: ICU integration

On Tue, Aug 30, 2016 at 7:46 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

In initdb, I initialize the default collation set as before from the
`locale -a` output, but also add all available ICU locales with a "%icu"
appended (so "fr_FR%icu"). I suppose one could create a configuration
option perhaps in initdb to change the default so that, say, "fr_FR"
uses ICU and "fr_FR%posix" uses the old stuff.

I suspect that we'd be better off adding a mechanism for adding a new
collation after initdb runs, on a live production instance. Maybe that
part can come later.

--
Peter Geoghegan


#7Doug Doole
ddoole@salesforce.com
In reply to: Peter Eisentraut (#3)
Re: ICU integration

Hi all. I’m new to the PostgreSQL code and the mailing list, but I’ve had a lot of experience with using ICU in a different database product. So while I’m not up to speed on the code yet, I can offer some insights on using ICU.

On Aug 30, 2016, at 9:12 PM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:

How stable are the ICU locales? Most importantly, does ICU offer any
way to "pin" a locale version, so we can say "we want de_DE as it was
in ICU 4.6" and get consistent behaviour when the user sets up a
replica on some other system with ICU 4.8? Even if the German
government has changed its mind (again) about some details of the
language and 4.8 knows about the changes but 4.4 doesn’t?

ICU explicitly does not provide stability in their locales and collations. We pushed them hard to provide this, but between changes to the CLDR data and changes to the ICU code it just wasn’t feasible for them to provide version to version stability.

What they do offer is a compile option when building ICU to version all their APIs. So instead of calling icu_foo() you’d call icu_foo46(). (Or something like this - it’s been a few years since I actually worked with the ICU code.) This ultimately allows you to load multiple versions of the ICU library into a single program and provide stability by calling the appropriate version of the library. (Unfortunately, the OS - at least my Linux box - only provides the generic version of ICU and not the version annotated APIs, which means a separate compile of ICU is needed.)

The catch with this is that it means you likely want to expose the version information. In another note it was suggested to use something like fr_FR%icu. If you want to pin it to a specific version of ICU, you’ll likely need something like fr_FR%icu46. (There’s nothing wrong with supporting fr_FR%icu to give users an easy way of saying “give me the latest and greatest”, but you’d probably want to harden it to a specific ICU version internally.)

I forgot to mention this, but the patch adds a collversion column that
stores the collation version (provided by ICU). And then when you
upgrade ICU to something incompatible you get

+           if (numversion != collform->collversion)
+               ereport(WARNING,
+                       (errmsg("ICU collator version mismatch"),
+                        errdetail("The database was created using version 0x%08X, the library provides version 0x%08X.",
+                                  (uint32) collform->collversion, (uint32) numversion),
+                        errhint("Rebuild affected indexes, or build PostgreSQL with the right version of ICU.")));

So you still need to manage this carefully, but at least you have a
chance to learn about it.

Indexes are the obvious place where collation comes into play, and are relatively easy to address. But consider all the places where string comparisons can be done. For example, check constraints and referential constraints can depend on string comparisons. If the collation rules change because of a new version of ICU, the database can become inconsistent and will need a lot more work than an index rebuild.

Suggestions for refining this are welcome.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Doug Doole
Salesforce


#8Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Michael Paquier (#4)
Re: ICU integration

On 8/31/16 12:32 AM, Michael Paquier wrote:

On Wed, Aug 31, 2016 at 1:12 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 8/30/16 11:27 PM, Craig Ringer wrote:

Speaking of which, have you had a chance to try it on Windows yet?

nope

+SELECT a, b FROM collate_test2 ORDER BY b;
+ a |  b
+---+-----
+ 1 | abc
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(4 rows)

I take that to mean, "it works".

Be careful with non-ASCII characters in the regression tests; remember,
for example, where jacana was not happy:
/messages/by-id/CAB7nPqROd2MXqy_5+cZJVhW0wHrrz6P8jV_RSbLcrXRTwLh7tQ@mail.gmail.com

That thread didn't tell me much, except that the client encoding didn't
handle some of the characters. That doesn't seem specific to Windows.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#9Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Doug Doole (#7)
Re: ICU integration

On 8/31/16 4:24 PM, Doug Doole wrote:

ICU explicitly does not provide stability in their locales and collations. We pushed them hard to provide this, but between changes to the CLDR data and changes to the ICU code it just wasn’t feasible for them to provide version to version stability.

What they do offer is a compile option when building ICU to version all their APIs. So instead of calling icu_foo() you’d call icu_foo46(). (Or something like this - it’s been a few years since I actually worked with the ICU code.) This ultimately allows you to load multiple versions of the ICU library into a single program and provide stability by calling the appropriate version of the library. (Unfortunately, the OS - at least my Linux box - only provides the generic version of ICU and not the version annotated APIs, which means a separate compile of ICU is needed.)

The catch with this is that it means you likely want to expose the version information. In another note it was suggested to use something like fr_FR%icu. If you want to pin it to a specific version of ICU, you’ll likely need something like fr_FR%icu46. (There’s nothing wrong with supporting fr_FR%icu to give users an easy way of saying “give me the latest and greatest”, but you’d probably want to harden it to a specific ICU version internally.)

There are multiple things going on.

Collations in ICU are versioned. You can find out the version of the
collation you are currently using via an API call. A collation
version does not change during the life of a single version of ICU. But
it might well change in the next version of ICU, as bugs are fixed and
things are refined. There is no way in the API to call for a collation
of a specific version, since there is only one version of a collation in
a specific installation of ICU. So my implementation is that we store
the version of the collation in the catalog when we create the
collation, and if we later on find at run time that the collation is of
a different version, we warn about it.
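
A minimal sketch of that API call, for concreteness (this is not the patch
code; presumably something like this runs at CREATE COLLATION time and the
result is stored in collversion):

#include <stdio.h>
#include <unicode/ucol.h>
#include <unicode/uversion.h>

int main(void)
{
    UErrorCode status = U_ZERO_ERROR;
    UCollator *coll = ucol_open("de_DE", &status);
    UVersionInfo ver;
    char buf[U_MAX_VERSION_STRING_LENGTH];

    if (U_FAILURE(status))
        return 1;
    ucol_getVersion(coll, ver);      /* version of this collation in this ICU build */
    u_versionToString(ver, buf);
    printf("de_DE collation version: %s\n", buf);
    ucol_close(coll);
    return 0;
}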

The ICU ABI (not API) is also versioned. The way that this is done is
that all functions are actually macros to a versioned symbol. So
ucol_open() is actually a macro that expands to, say, ucol_open_57() in
ICU version 57. (They also got rid of a dot in their versions a while
ago.) It's basically hand-crafted symbol versioning. That way, you can
link with multiple versions of ICU at the same time. However, the
purpose of that, as I understand it, is so that plugins can have a
different version of ICU loaded than the main process or another plugin.
In terms of postgres using the right version of ICU, it doesn't buy
anything beyond what the soname mechanism does.
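
For illustration, a tiny program that reports which ICU the process actually
ended up linked against (compile-time macro vs. the runtime library), which
is essentially what the soname mechanism guarantees:

#include <stdio.h>
#include <unicode/uversion.h>

int main(void)
{
    UVersionInfo v;
    char buf[U_MAX_VERSION_STRING_LENGTH];

    u_getVersion(v);                 /* version of the ICU library actually loaded */
    u_versionToString(v, buf);
    printf("runtime ICU: %s, compiled against: %s\n", buf, U_ICU_VERSION);
    return 0;
}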

+           if (numversion != collform->collversion)
+               ereport(WARNING,
+                       (errmsg("ICU collator version mismatch"),
+                        errdetail("The database was created using version 0x%08X, the library provides version 0x%08X.",
+                                  (uint32) collform->collversion, (uint32) numversion),
+                        errhint("Rebuild affected indexes, or build PostgreSQL with the right version of ICU.")));

So you still need to manage this carefully, but at least you have a
chance to learn about it.

Indexes are the obvious place where collation comes into play, and are relatively easy to address. But consider all the places where string comparisons can be done. For example, check constraints and referential constraints can depend on string comparisons. If the collation rules change because of a new version of ICU, the database can become inconsistent and will need a lot more work than an index rebuild.

We can refine the guidance. But indexes are the most important issue, I
think, because changing the sorting rules in the background makes data
silently disappear.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#10Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Peter Geoghegan (#5)
Re: ICU integration

On 8/31/16 1:32 PM, Peter Geoghegan wrote:

ICU is happy to have multiple versions
of a collation at a time, and you'll probably retain the old collation
version in ICU.

Even if your old collation version isn't available in a new ICU
release (which I think is unlikely in practice)

I think this is wrong, or I have misunderstood the ICU documentation.
There is no way to "choose" a collation version. You can only record
the one you have gotten and check that you get the same one next time.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#11Doug Doole
ddoole@salesforce.com
In reply to: Peter Eisentraut (#9)
Re: ICU integration

The ICU ABI (not API) is also versioned. The way that this is done is
that all functions are actually macros to a versioned symbol. So
ucol_open() is actually a macro that expands to, say, ucol_open_57() in
ICU version 57. (They also got rid of a dot in their versions a while
ago.) It's basically hand-crafted symbol versioning. That way, you can
link with multiple versions of ICU at the same time. However, the
purpose of that, as I understand it, is so that plugins can have a
different version of ICU loaded than the main process or another plugin.
In terms of postgres using the right version of ICU, it doesn't buy
anything beyond what the soname mechanism does.

You can access the versioned API as well, it's just not documented. (The
ICU team does support this - we worked very closely with them when doing
all this.) We exploited the versioned API when we learned that there is no
guarantee of backwards compatibility in collations. You can't just change a
collation under a user (at least that was our opinion) since it can cause
all sorts of problems. Refreshing a collation (especially on the fly) is a
lot more work than we were prepared to take on. So we exploited the
versioned APIs.

We carried the ICU version numbers around on our collation and locale IDs
(such as fr_FR%icu36) . The database would load multiple versions of the
ICU library so that something created with ICU 3.6 would always be
processed with ICU 3.6. This avoided the problems of trying to change the
rules on the user. (We'd always intended to provide tooling to allow the
user to move an existing object up to a newer version of ICU, but we never
got around to doing it.) In the code, this meant we were explicitly calling
the versioned API so that we could keep the calls straight. (Of course this
was abstracted in a set of our own locale functions so that the rest of the
engine was ignorant of the ICU library fun that was going on.)

We can refine the guidance. But indexes are the most important issue, I
think, because changing the sorting rules in the background makes data
silently disappear.

I'd say that collation is the most important issue, but collation impacts a
lot more than indexes.

Unfortunately as part of changing companies I had to leave my "screwy stuff
that has happened in collations" presentation behind so I don't have
concrete examples to point to, but I can cook up illustrative examples:

- Suppose in ICU X.X, AA = Å but in ICU Y.Y AA != Å. Further suppose there
was an RI constraint where the primary key used AA and the foreign key
used Å. If ICU was updated, the RI constraint between the rows would break,
leaving an orphaned foreign key.

- I can't remember the specific language but they had the collation rule
that "CH" was treated as a distinct entity between C and D. This gave the
order C < CG < CI < CZ < CH < D. Then they removed CH as special which gave
C < CG < CH < CI < CZ < D. Suppose there was the constraint CHECK (COL
BETWEEN 'C' AND 'CH'). Originally it would allow (almost) all strings that
started with C. After the change the constraint would block everything
that started with CI - CZ, leaving many rows that no longer qualify in the
database.

It could be argued that these are edge cases and not likely to be hit.
That's likely true for a lot of users. But for a user who hits this, their
database is going to be left in a mess.

--
Doug Doole


#12Peter Geoghegan
pg@heroku.com
In reply to: Doug Doole (#11)
Re: ICU integration

On Tue, Sep 6, 2016 at 10:40 AM, Doug Doole <ddoole@salesforce.com> wrote:

The ICU ABI (not API) is also versioned. The way that this is done is
that all functions are actually macros to a versioned symbol. So
ucol_open() is actually a macro that expands to, say, ucol_open_57() in
ICU version 57. (They also got rid of a dot in their versions a while
ago.) It's basically hand-crafted symbol versioning. That way, you can
link with multiple versions of ICU at the same time. However, the
purpose of that, as I understand it, is so that plugins can have a
different version of ICU loaded than the main process or another plugin.
In terms of postgres using the right version of ICU, it doesn't buy
anything beyond what the soname mechanism does.

You can access the versioned API as well, it's just not documented. (The ICU
team does support this - we worked very closely with them when doing all
this.) We exploited the versioned API when we learned that there is no
guarantee of backwards compatibility in collations. You can't just change a
collation under a user (at least that was our opinion) since it can cause
all sorts of problems. Refreshing a collation (especially on the fly) is a
lot more work than we were prepared to take on. So we exploited the
versioned APIs.

I originally got some of this information from the ICU Doxygen site
for the C API, which isn't great documentation, but also isn't bad. I
admit that there are certainly gaps in my understanding of how to
bridge our requirements with versioning to what ICU can give us.

--
Peter Geoghegan


#13Peter Geoghegan
pg@heroku.com
In reply to: Doug Doole (#11)
Re: ICU integration

On Tue, Sep 6, 2016 at 10:40 AM, Doug Doole <ddoole@salesforce.com> wrote:

- Suppose in ICU X.X, AA = Å but in ICU Y.Y AA != Å. Further suppose there
was an RI constraint where the primary key used AA and the foreign key used
Å. If ICU was updated, the RI constraint between the rows would break,
leaving an orphaned foreign key.

This isn't a problem for Postgres, or at least wouldn't be right now,
because we don't have case insensitive collations. So, we use a
strcmp()/memcmp() tie-breaker when strcoll() indicates equality, while
also making the general notion of text equality actually mean binary
equality. In short, we are aware that cases like this exist. IIRC
Unicode Technical Standard #10 independently recommends that this
tie-breaker strategy is one way of dealing with problems like this, in
a pinch, though I think we came up with the idea independently of that
recommendation. This was in response to a bug report over 10 years
ago.
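
The tie-breaker idea, in sketch form (simplified; the real thing is
varstr_cmp() in varlena.c, which also deals with encodings, per-column
collations, and non-NUL-terminated buffers):

#include <stdio.h>
#include <string.h>
#include <locale.h>

static int
text_cmp_sketch(const char *a, const char *b)
{
    int r = strcoll(a, b);          /* locale-aware comparison */

    if (r == 0)
        r = strcmp(a, b);           /* tie-breaker: "equal" must also mean
                                     * byte-for-byte identical */
    return r;
}

int main(void)
{
    setlocale(LC_COLLATE, "");      /* use the environment's collation */
    printf("%d\n", text_cmp_sketch("abc", "ABC"));
    return 0;
}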

I would like to get case insensitive collations some day, and was
really hoping that ICU would help. That being said, the need for a
strcmp() tie-breaker makes that hard. Oh well.

--
Peter Geoghegan


#14Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Doug Doole (#11)
Re: ICU integration

On 9/6/16 1:40 PM, Doug Doole wrote:

We carried the ICU version numbers around on our collation and locale
IDs (such as fr_FR%icu36) . The database would load multiple versions of
the ICU library so that something created with ICU 3.6 would always be
processed with ICU 3.6. This avoided the problems of trying to change
the rules on the user. (We'd always intended to provide tooling to allow
the user to move an existing object up to a newer version of ICU, but we
never got around to doing it.) In the code, this meant we were
explicitly calling the versioned API so that we could keep the calls
straight.

I understand that in principle, but I don't see operating system
providers shipping a bunch of ICU versions to facilitate that. They
will usually ship one.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#15Doug Doole
ddoole@salesforce.com
In reply to: Peter Eisentraut (#14)
Re: ICU integration

I understand that in principle, but I don't see operating system
providers shipping a bunch of ICU versions to facilitate that. They
will usually ship one.

Yep. If you want the protection I've described, you can't depend on the OS
copy of ICU. You need to have multiple ICU libraries that are
named/installed such that you can load the specific version you want. It
also means that you can have dependencies on versions of ICU that are no
longer supported. (In my previous project, we were shipping 3 copies of the
ICU library, going back to 2.x. Needless to say, we didn't add support for
every drop of ICU.)


#16Doug Doole
ddoole@salesforce.com
In reply to: Peter Geoghegan (#13)
Re: ICU integration

This isn't a problem for Postgres, or at least wouldn't be right now,
because we don't have case insensitive collations.

I was wondering if Postgres might be that way. It does avoid the RI
constraint problem, but there are still troubles with range based
predicates. (My previous project wanted case/accent insensitive collations,
so we got to deal with it all.)

So, we use a strcmp()/memcmp() tie-breaker when strcoll() indicates
equality, while also making the general notion of text equality actually
mean binary equality.

We used a similar tie breaker in places. (e.g. Index keys needed to be
identical, not just equal. We also broke ties in sort to make its behaviour
more deterministic.)

I would like to get case insensitive collations some day, and was

really hoping that ICU would help. That being said, the need for a
strcmp() tie-breaker makes that hard. Oh well.

Prior to adding ICU to my previous project, it had the assumption that
equal meant identical as well. It turned out to be a lot easier to break
this assumption than I expected, but that code base had religiously used
its own string comparison function for user data - strcmp()/memcmp() was
never called for user data. (I don't know if the same can be said for
Postgres.) We found that very few places needed to be aware of values that
were equal but not identical. (Index and sort were the big two.)

Hopefully Postgres will be the same.

--
Doug Doole

#17Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Doug Doole (#11)
Re: ICU integration

- I can't remember the specific language but they had the collation rule
that "CH" was treated as a distinct entity between C and D. This gave the
order C < CG < CI < CZ < CH < D. Then they removed CH as special which gave
C < CG < CH < CI < CZ < D. Suppose there was the constraint CHECK (COL
BETWEEN 'C' AND 'CH'). Originally it would allow (almost) all strings that
started with C. After the change it the constraint would block everything
that started with CI - CZ leaving many rows that no longer qualify in the
database.

(This was probably Spanish.)

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#14)
Re: ICU integration

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

On 9/6/16 1:40 PM, Doug Doole wrote:

We carried the ICU version numbers around on our collation and locale
IDs (such as fr_FR%icu36) . The database would load multiple versions of
the ICU library so that something created with ICU 3.6 would always be
processed with ICU 3.6. This avoided the problems of trying to change
the rules on the user. (We'd always intended to provide tooling to allow
the user to move an existing object up to a newer version of ICU, but we
never got around to doing it.) In the code, this meant we were
explicitly calling the versioned API so that we could keep the calls
straight.

I understand that in principle, but I don't see operating system
providers shipping a bunch of ICU versions to facilitate that. They
will usually ship one.

I agree with that estimate, and I would further venture that even if we
wanted to bundle ICU into our tarballs, distributors would rip it out
again on security grounds. I am dead certain Red Hat would do so; less
sure that other vendors have similar policies, but it seems likely.
They don't want to have to fix security bugs in more than one place.

This is a problem, if ICU won't guarantee cross-version compatibility,
because it destroys the argument that moving to ICU would offer us
collation behavior stability.

regards, tom lane


#19Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Tom Lane (#18)
Re: ICU integration

On 9/8/16 11:16 AM, Tom Lane wrote:

This is a problem, if ICU won't guarantee cross-version compatibility,
because it destroys the argument that moving to ICU would offer us
collation behavior stability.

It would offer a significant upgrade over the current situation.

First, it offers stability inside the same version. Whereas glibc might
change a collation in a minor upgrade, ICU won't do that. And the
postgres binary is bound to a major version of ICU by the soname (which
changes with every major release). So this would avoid the situation
that a simple OS update could break collations.

Second, it offers a way to detect that something has changed. With
glibc, you don't know anything unless you read the source diffs. With
ICU, you can compare the collation version before and after and at least
tell the user that they need to refresh indexes or whatever.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#20Peter Geoghegan
pg@heroku.com
In reply to: Tom Lane (#18)
2 attachment(s)
Re: ICU integration

On Thu, Sep 8, 2016 at 8:16 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I understand that in principle, but I don't see operating system
providers shipping a bunch of ICU versions to facilitate that. They
will usually ship one.

I agree with that estimate, and I would further venture that even if we
wanted to bundle ICU into our tarballs, distributors would rip it out
again on security grounds.

I agree that we're not going to bundle our own ICU. And, that
packagers have to be more or less on board with whatever plan we come
up with for any of this to be of much practical value. The plan itself is
at least as important as the patch.

This is a problem, if ICU won't guarantee cross-version compatibility,
because it destroys the argument that moving to ICU would offer us
collation behavior stability.

Not exactly. Peter E. didn't seem to be aware that there is an ICU
collator versioning concept (perhaps I misunderstood, though). It
might be that in practice, the locales are very stable, so it almost
doesn't matter that it's annoying when they change. Note that
"collators" are versioned in a sophisticated way, not locales.

You can build the attached simple C program to see the versions of
available collators from each locale, as follows:

$ gcc icu-test.c -licui18n -licuuc -o icu-coll-versions
$ ./icu-coll-versions | head -n 20
Collator | ICU Version | UCA Version
-----------------------------------------------------------------------------
Afrikaans | 99-38-00-00 | 07-00-00-00
Afrikaans (Namibia) | 99-38-00-00 | 07-00-00-00
Afrikaans (South Africa) | 99-38-00-00 | 07-00-00-00
Aghem | 99-38-00-00 | 07-00-00-00
Aghem (Cameroon) | 99-38-00-00 | 07-00-00-00
Akan | 99-38-00-00 | 07-00-00-00
Akan (Ghana) | 99-38-00-00 | 07-00-00-00
Amharic | 99-38-00-00 | 07-00-00-00
Amharic (Ethiopia) | 99-38-00-00 | 07-00-00-00
Arabic | 99-38-1B-01 | 07-00-00-00
Arabic (World) | 99-38-1B-01 | 07-00-00-00
Arabic (United Arab Emirates) | 99-38-1B-01 | 07-00-00-00
Arabic (Bahrain) | 99-38-1B-01 | 07-00-00-00
Arabic (Djibouti) | 99-38-1B-01 | 07-00-00-00
Arabic (Algeria) | 99-38-1B-01 | 07-00-00-00
Arabic (Egypt) | 99-38-1B-01 | 07-00-00-00
Arabic (Western Sahara) | 99-38-1B-01 | 07-00-00-00
Arabic (Eritrea) | 99-38-1B-01 | 07-00-00-00

I also attach a full list from my Ubuntu 16.04 laptop. I'll try to
find some other system to generate output from, to see how close it
matches what I happen to have here.
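
A rough sketch of what such a program can look like, using only documented
ICU calls (the attached icu-test.c may well differ in details such as the
human-readable display names and formatting):

#include <stdio.h>
#include <unicode/ucol.h>

int main(void)
{
    int32_t i, n = ucol_countAvailable();

    printf("%-40s | %-12s | %s\n", "Collator", "ICU Version", "UCA Version");
    for (i = 0; i < n; i++)
    {
        const char *loc = ucol_getAvailable(i);
        UErrorCode status = U_ZERO_ERROR;
        UCollator *coll = ucol_open(loc, &status);
        UVersionInfo cv, uv;

        if (U_FAILURE(status))
            continue;
        ucol_getVersion(coll, cv);       /* collator version */
        ucol_getUCAVersion(coll, uv);    /* underlying UCA version */
        printf("%-40s | %02X-%02X-%02X-%02X | %02X-%02X-%02X-%02X\n",
               loc, cv[0], cv[1], cv[2], cv[3], uv[0], uv[1], uv[2], uv[3]);
        ucol_close(coll);
    }
    return 0;
}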

"ICU version" here is an opaque 32-bit integer [1]https://ssl.icu-project.org/apiref/icu4c/ucol_8h.html#af756972781ac556a62e48cbd509ea4a6 -- Peter Geoghegan. I'd be interested
to see how much the output of this program differs from one major
version of ICU to the next. Collations will change. of course, but not
that often. It's not the end of the world if somebody has to REINDEX
when they change major OS version. It would be nice if everything just
continued to work with no further input from the user, but it's not
essential, assuming that collation are pretty stable in practice,
which I think they are. It is a total disaster if a mismatch in
collations is initially undetected, though.

Another issue that nobody has mentioned here, I think, is that the
glibc people just don't seem to care about our use-case (Carlos
O'Donnell basically said as much, during the strxfrm() debacle earlier
this year, but it wasn't limited to how we were relying on strxfrm()
at that time). Since it's almost certainly true that other major
database systems are critically reliant on ICU's strxfrm() agreeing
with strcoll (substitute ICU equivalent spellings), and issues beyond
that, it stands to reason that they take that stuff very seriously. It
would be really nice to get back abbreviated keys for collated text,
IMV. I think ICU gets us that. Even if we used ICU in exactly the same
way as we use the C standard library today, that general sense of
stability being critical that ICU has would still be a big advantage.
If ICU drops the ball on collation stability, or lets strxfrm() disagree
with strcoll(), it's a huge problem for lots of groups of people, not
just us.

[1]: https://ssl.icu-project.org/apiref/icu4c/ucol_8h.html#af756972781ac556a62e48cbd509ea4a6
--
Peter Geoghegan

Attachments:

full-list.txt (text/plain)
icu-test.c (text/x-csrc)
#21Craig Ringer
craig@2ndquadrant.com
In reply to: Peter Eisentraut (#19)
Re: ICU integration

On 9 September 2016 at 00:19, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 9/8/16 11:16 AM, Tom Lane wrote:

This is a problem, if ICU won't guarantee cross-version compatibility,
because it destroys the argument that moving to ICU would offer us
collation behavior stability.

It would offer a significant upgrade over the current situation.

First, it offers stability inside the same version. Whereas glibc might
change a collation in a minor upgrade, ICU won't do that. And the
postgres binary is bound to a major version of ICU by the soname (which
changes with every major release). So this would avoid the situation
that a simple OS update could break collations.

It also lets *users* and PostgreSQL-specific distributors bundle ICU
and get stable collations. We can't exactly bundle glibc.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#22Tatsuo Ishii
ishii@sraoss.co.jp
In reply to: Craig Ringer (#21)
Re: ICU integration

On 9 September 2016 at 00:19, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 9/8/16 11:16 AM, Tom Lane wrote:

This is a problem, if ICU won't guarantee cross-version compatibility,
because it destroys the argument that moving to ICU would offer us
collation behavior stability.

It would offer a significant upgrade over the current situation.

First, it offers stability inside the same version. Whereas glibc might
change a collation in a minor upgrade, ICU won't do that. And the
postgres binary is bound to a major version of ICU by the soname (which
changes with every major release). So this would avoid the situation
that a simple OS update could break collations.

It also lets *users* and PostgreSQL-specific distributors bundle ICU
and get stable collations. We can't exactly bundle glibc.

I would like to know the fate of community RPMs because if PostgreSQL
does not include ICU source, the effort of integrating ICU is totally
up to packagers.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


#23Craig Ringer
craig@2ndquadrant.com
In reply to: Tatsuo Ishii (#22)
Re: ICU integration

On 9 September 2016 at 08:51, Tatsuo Ishii <ishii@sraoss.co.jp> wrote:

On 9 September 2016 at 00:19, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 9/8/16 11:16 AM, Tom Lane wrote:

This is a problem, if ICU won't guarantee cross-version compatibility,
because it destroys the argument that moving to ICU would offer us
collation behavior stability.

It would offer a significant upgrade over the current situation.

First, it offers stability inside the same version. Whereas glibc might
change a collation in a minor upgrade, ICU won't do that. And the
postgres binary is bound to a major version of ICU by the soname (which
changes with every major release). So this would avoid the situation
that a simple OS update could break collations.

It also lets *users* and PostgreSQL-specific distributors bundle ICU
and get stable collations. We can't exactly bundle glibc.

I would like to know the fate of community RPMs because if PostgreSQL
does not include ICU source, the effort of integrating ICU is totally
up to packagers.

CC'ing Devrim.

Personally I would be pretty reluctant to package libicu when it's
already in RH/Fedora. If it were bundled in Pg's source tree and a
private copy was built/installed by the build system so it was part of
the main postgresql-server package that'd be different. I can't really
imagine that being acceptable for upstream development though.

RH and Debian would instantly rip it out and replace it with their
packaged ICU anyway.

Pity ICU doesn't offer versioned collations within a single install.
Though I can understand why they don't.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#24Peter Geoghegan
pg@heroku.com
In reply to: Craig Ringer (#23)
Re: ICU integration

On Thu, Sep 8, 2016 at 6:48 PM, Craig Ringer <craig@2ndquadrant.com> wrote:

Pity ICU doesn't offer versioned collations within a single install.
Though I can understand why they don't.

There are two separate issues with collator versioning. ICU can
probably be used in a way that clearly decouples these two issues,
which is very important. The first is that the rules of collations
change. The second is that the binary key that collators create (i.e.
the equivalent of strxfrm()) can change for various reasons that have
nothing to do with culture or natural languages -- purely technical
reasons. For example, they can add new optimizations to make
generating new binary keys faster. If there are bugs in how that
works, they can fix the bugs and increment the identifier [1], which
could allow Postgres to insist on a REINDEX (if abbreviated keys for
collated text were reenabled, although I don't think that problems
like that are limited to binary key generation).

So, to bring it back to that little program I wrote:

$ ./icu-coll-versions | head
Collator | ICU Version | UCA Version
-----------------------------------------------------------------------------
Afrikaans | 99-38-00-00 | 07-00-00-00
Afrikaans (Namibia) | 99-38-00-00 | 07-00-00-00
Afrikaans (South Africa) | 99-38-00-00 | 07-00-00-00
Aghem | 99-38-00-00 | 07-00-00-00
Aghem (Cameroon) | 99-38-00-00 | 07-00-00-00
Akan | 99-38-00-00 | 07-00-00-00
Akan (Ghana) | 99-38-00-00 | 07-00-00-00
Amharic | 99-38-00-00 | 07-00-00-00

Here, what appears as "ICU version" has the identifier [1] baked in,
although this is undocumented (it also has any "custom tailorings"
that might be used, say if we had user defined customizations to
collations, as Firebird apparently does [2][3]). I'm pretty sure that
UCA version relates to a version of the Unicode collation algorithm,
and its associated DUCET table (this is all subject to ISO
standardization). I gather that a particular collation is actually
comprised of a base UCA version (and DUCET table -- I think that ICU
sometimes calls this the "root"), with custom tailorings that a locale
provides for a given culture or country. These collators may in turn
be further "tailored" to get that fancy user defined customization
stuff.

In principle, and assuming I haven't gotten something wrong, it ought
to be possible to unambiguously identify a collation based on a
matching UCA version (i.e. DUCET table), plus the collation tailorings
matching exactly, even across ICU versions that happen to be based on
the same UCA version (they only seem to put out a new UCA version
about once a year [4]). It *might* be fine, practically speaking, to
assume that a collation with a matching iso-code and UCA version is
compatible forever and always across any ICU version. If not, it might
instead be feasible to write a custom fingerprinter for collation
tailorings that ran at initdb time. Maybe the tailorings, which are
abstract rules, could even be stored in system catalogs, so the only
thing that need match is ICU's UCA version (the "root" collators must
still match), since replicas may reconstruct the serialized tailorings
that comprise a collation as needed [5][6], since the tailoring that a
default collator for a locale uses isn't special, technically
speaking.

Of course, this is all pretty hand-wavey right now, and much more
research is needed. I am very intrigued about the idea of storing the
collators in the system catalogs wholesale, since ICU provides
facilities that make that seem possible. If a "serialized unicode set"
built from a collator's tailoring rules, or, alternatively, a collator
saved as a binary representation [7] were stored in the system
catalogs, perhaps it wouldn't matter as much that the stuff
distributed with different ICU versions didn't match, at least in
theory. It's unclear that the system catalog representation could be
usable with a fair cross section of ICU versions, but if it could then
that would be perfect. This also seems to be how Firebird style
user-defined tailorings might be implemented anyway, and it seems very
appealing to add that as a light layer on top of how the base system
works, if at all possible.
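
To make that last part concrete, the round trip that [7] enables looks
roughly like this (a sketch with invented names and simplified error
handling, not something I've actually wired into Postgres):

#include <unicode/ucol.h>

/* Flatten a collator into a byte buffer, e.g. to store in a catalog. */
static int32_t
serialize_collator(const UCollator *coll, uint8_t *buf, int32_t bufsize)
{
    UErrorCode  status = U_ZERO_ERROR;
    int32_t     len = ucol_cloneBinary(coll, buf, bufsize, &status);

    return U_FAILURE(status) ? -1 : len;
}

/*
 * Reconstitute it later.  "base" has to be the root collator of the
 * running ICU, so the root/UCA version still has to match.
 */
static UCollator *
deserialize_collator(const uint8_t *buf, int32_t len, const UCollator *base)
{
    UErrorCode  status = U_ZERO_ERROR;
    UCollator  *coll = ucol_openBinary(buf, len, base, &status);

    return U_FAILURE(status) ? NULL : coll;
}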

[1]: https://github.com/svn2github/libicu/blob/c43ec130ea0ee6cd565d87d70088e1d70d892f32/common/unicode/uvernum.h#L149
[2]: http://www.firebirdsql.org/refdocs/langrefupd25-ddl-collation.html
[3]: http://userguide.icu-project.org/collation/customization#TOC-Building-on-Existing-Locales
[4]: http://unicode.org/reports/tr10/#Synch_14651_Table
[5]: https://ssl.icu-project.org/apiref/icu4c/ucol_8h.html#a1982f184bca8adaa848144a1959ff235
[6]: https://ssl.icu-project.org/apiref/icu4c/structUSerializedSet.html
[7]: https://ssl.icu-project.org/apiref/icu4c/ucol_8h.html#a2719995a75ebed7aacc1419bb2b781db
--
Peter Geoghegan


#25Devrim Gündüz
devrim@gunduz.org
In reply to: Craig Ringer (#23)
Re: ICU integration

Hi,

On Fri, 2016-09-09 at 09:48 +0800, Craig Ringer wrote:

Personally I would be pretty reluctant to package libicu when it's
already in RH/Fedora. If it were bundled in Pg's source tree and a
private copy was built/installed by the build system so it was part of
the main postgresql-server package that'd be different.

Agreed. I did not read the whole thread (yet), but if this is something like
tzdata, I would personally want to use the libicu supplied by the OS, like we do
for tzdata.

(That said, just checked EDB's ICU support. We currently ship our own libicu
there, as a part of EPAS, but I don't know the reasoning/history behind that.)

Regards,
--
Devrim GÜNDÜZ
EnterpriseDB: http://www.enterprisedb.com
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
Twitter: @DevrimGunduz , @DevrimGunduzTR

#26Magnus Hagander
magnus@hagander.net
In reply to: Peter Eisentraut (#19)
Re: ICU integration

On Thu, Sep 8, 2016 at 6:19 PM, Peter Eisentraut <
peter.eisentraut@2ndquadrant.com> wrote:

On 9/8/16 11:16 AM, Tom Lane wrote:

This is a problem, if ICU won't guarantee cross-version compatibility,
because it destroys the argument that moving to ICU would offer us
collation behavior stability.

It would offer a significant upgrade over the current situation.

First, it offers stability inside the same version. Whereas glibc might
change a collation in a minor upgrade, ICU won't do that. And the
postgres binary is bound to a major version of ICU by the soname (which
changes with every major release). So this would avoid the situation
that a simple OS update could break collations.

Second, it offers a way to detect that something has changed. With
glibc, you don't know anything unless you read the source diffs. With
ICU, you can compare the collation version before and after and at least
tell the user that they need to refresh indexes or whatever.

+1 on the importance of this last part.

We may not be able to handle it directly, but just being able to point out
to the user that "this index is incorrect, you have to reindex" and then
refuse to use the index until that has been done would be a *huge*
improvement. And it would definitely help solve an existing real-world
problem, which is what can happen when you restore a physical backup onto a
different version of an operating system at least.

Sure, it would be even better if we could automatically *deal* with it. But
failing in a loud and obvious way is a *lot* better than silently returning
incorrect data...

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#27Craig Ringer
craig@2ndquadrant.com
In reply to: Magnus Hagander (#26)
Re: ICU integration

On 9 September 2016 at 16:21, Magnus Hagander <magnus@hagander.net> wrote:

On Thu, Sep 8, 2016 at 6:19 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 9/8/16 11:16 AM, Tom Lane wrote:

This is a problem, if ICU won't guarantee cross-version compatibility,
because it destroys the argument that moving to ICU would offer us
collation behavior stability.

It would offer a significant upgrade over the current situation.

First, it offers stability inside the same version. Whereas glibc might
change a collation in a minor upgrade, ICU won't do that. And the
postgres binary is bound to a major version of ICU by the soname (which
changes with every major release). So this would avoid the situation
that a simple OS update could break collations.

Second, it offers a way to detect that something has changed. With
glibc, you don't know anything unless you read the source diffs. With
ICU, you can compare the collation version before and after and at least
tell the user that they need to refresh indexes or whatever.

+1 on the importance of this last part.

We may not be able to handle it directly, but just being able to point out
to the user that "this index is incorrect, you have to reindex" and then
refuse to use the index until that has been done would be a *huge*
improvement. And it would definitely help solve an existing real-world
problem, which is what can happen when you restore a physical backup onto a
different version of an operating system at least.

Sure, it would be even better if we could automatically *deal* with it. But
failing in a loud and obvious way is a *lot* better than silently returning
incorrect data...

Yep, I strongly agree. That's part of why I think this is well worth
doing even though it doesn't look like it'll give us a full solution
for stable collations.

It's likely also a step or three toward case-insensitive
locales/collations, which I'm sure many people would like. Though far
from the whole picture.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#28Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Peter Geoghegan (#24)
Re: ICU integration

On 9/8/16 11:08 PM, Peter Geoghegan wrote:

In principle, and assuming I haven't gotten something wrong, it ought
to be possible to unambiguously identify a collation based on a
matching UCA version (i.e. DUCET table), plus the collation tailorings
matching exactly, even across ICU versions that happen to be based on
the same UCA version (they only seem to put out a new UCA version
about once a year [4]). It *might* be fine, practically speaking, to
assume that a collation with a matching iso-code and UCA version is
compatible forever and always across any ICU version. If not, it might
instead be feasible to write a custom fingerprinter for collation
tailorings that ran at initdb time.

The documentation [0] states that the collation version covers a broad
range of things. So I don't think these additional mechanisms you
describe are necessary.

[0]: http://userguide.icu-project.org/collation/architecture#TOC-Versioning

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#29Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Peter Geoghegan (#20)
Re: ICU integration

On 9/8/16 8:07 PM, Peter Geoghegan wrote:

Not exactly. Peter E. didn't seem to be aware that there is an ICU
collator versioning concept (perhaps I misunderstood, though).

You appear to have missed the part about the collversion column that my
patch adds. That's exactly the collator version of ICU.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#30Dave Page
dpage@pgadmin.org
In reply to: Devrim Gündüz (#25)
Re: ICU integration

On Fri, Sep 9, 2016 at 9:02 AM, Devrim Gündüz <devrim@gunduz.org> wrote:

Hi,

On Fri, 2016-09-09 at 09:48 +0800, Craig Ringer wrote:

Personally I would be pretty reluctant to package libicu when it's
already in RH/Fedora. If it were bundled in Pg's source tree and a
private copy was built/installed by the build system so it was part of
the main postgresql-server package that'd be different.

Agreed. I did not read the whole thread (yet), but if this is something like
tzdata, I would personally want to use the libicu supplied by the OS, like we do
for tzdata.

(That said, just checked EDB's ICU support. We currently ship our own libicu
there, as a part of EPAS, but I don't know the reasoning/history behind that.)

We needed a specific version that was newer than that shipped with
RHEL 6 (or in EPEL) iirc.

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#31Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dave Page (#30)
Re: ICU integration

Dave Page <dpage@pgadmin.org> writes:

We needed a specific version that was newer than that shipped with
RHEL 6 (or in EPEL) iirc.

Sure hope that's not true of the currently-proposed patch :-(

regards, tom lane


#32Dave Page
dpage@pgadmin.org
In reply to: Tom Lane (#31)
Re: ICU integration

On Fri, Sep 9, 2016 at 2:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Dave Page <dpage@pgadmin.org> writes:

We needed a specific version that was newer than that shipped with
RHEL 6 (or in EPEL) iirc.

Sure hope that's not true of the currently-proposed patch :-(

Looking back at my old emails, apparently ICU 5.0 and later include
ucol_strcollUTF8() which avoids the need to convert UTF-8 characters
to 16 bit before sorting. RHEL 6 has the older 4.2 version of ICU.
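
For what it's worth, the difference looks roughly like this (just a
sketch for illustration, not code from the patch; on older ICU you have
to bounce through UTF-16 first):

#include <unicode/ucol.h>
#include <unicode/ustring.h>

static int
compare_utf8(UCollator *coll, const char *a, int32_t alen,
             const char *b, int32_t blen)
{
    UErrorCode  status = U_ZERO_ERROR;
    UCollationResult r;

#if U_ICU_VERSION_MAJOR_NUM >= 50
    /* ICU 50 ("5.0") and later: compare the UTF-8 bytes directly */
    r = ucol_strcollUTF8(coll, a, alen, b, blen, &status);
#else
    /* older ICU: convert both strings to UChar (UTF-16) first */
    UChar       ua[1024],
                ub[1024];
    int32_t     ualen,
                ublen;

    u_strFromUTF8(ua, 1024, &ualen, a, alen, &status);
    u_strFromUTF8(ub, 1024, &ublen, b, blen, &status);
    r = ucol_strcoll(coll, ua, ualen, ub, ublen);
#endif

    if (U_FAILURE(status))
        return 0;               /* error handling elided in this sketch */
    return (r == UCOL_EQUAL) ? 0 : (r == UCOL_GREATER ? 1 : -1);
}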

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#33Peter Geoghegan
pg@heroku.com
In reply to: Dave Page (#32)
Re: ICU integration

On Fri, Sep 9, 2016 at 6:39 AM, Dave Page <dpage@pgadmin.org> wrote:

Looking back at my old emails, apparently ICU 5.0 and later include
ucol_strcollUTF8() which avoids the need to convert UTF-8 characters
to 16 bit before sorting. RHEL 6 has the older 4.2 version of ICU.

At the risk of stating the obvious, there is a reason why ICU
traditionally worked with UTF-16 natively. It's the same reason why
many OSes and application frameworks (e.g., Java) use UTF-16
internally, even though UTF-8 is much more popular on the web. Which
is: there are certain low-level optimizations possible that are not
possible with UTF-8.

I'm not saying that it would be just as good if we were to not use the
UTF-8 optimized stuff that ICU now has. My point is that it's not
useful to prejudge whether or not performance will be acceptable based
on a factor like this, which is ultimately just an implementation
detail. The ICU patch either performs acceptably as a substitute for
something like glibc, or it does not.

--
Peter Geoghegan


#34Bruce Momjian
bruce@momjian.us
In reply to: Peter Eisentraut (#19)
Re: ICU integration

On Thu, Sep 8, 2016 at 12:19:39PM -0400, Peter Eisentraut wrote:

On 9/8/16 11:16 AM, Tom Lane wrote:

This is a problem, if ICU won't guarantee cross-version compatibility,
because it destroys the argument that moving to ICU would offer us
collation behavior stability.

It would offer a significant upgrade over the current situation.

First, it offers stability inside the same version. Whereas glibc might
change a collation in a minor upgrade, ICU won't do that. And the
postgres binary is bound to a major version of ICU by the soname (which
changes with every major release). So this would avoid the situation
that a simple OS update could break collations.

Uh, how do we know that ICU doesn't change collations in minor versions?
Couldn't we get into a case where the OS changes the ICU version or
collations more frequently than glibc does? Seems that would be a
negative.

I don't see how having our binary bound to a ICU major version gives us
any benefit.

It seems we are still hostage to the OS version.

Second, it offers a way to detect that something has changed. With
glibc, you don't know anything unless you read the source diffs. With
ICU, you can compare the collation version before and after and at least
tell the user that they need to refresh indexes or whatever.

Yes, that is a clear win.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +


#35Petr Jelinek
petr@2ndquadrant.com
In reply to: Peter Eisentraut (#1)
Re: ICU integration

On 31/08/16 04:46, Peter Eisentraut wrote:

Here is a patch I've been working on to allow the use of ICU for sorting
and other locale things.

Hi, first pass review of the patch (somewhat high level at this point).

First, there was some discussion about ICU versioning and collation
versioning and support for multiple versions of ICU at the same time. My
opinion on this is that the approach you have chosen is good, at least
for a first iteration of the feature. We might want to support multiple
versions of the library in the future (I am not all that convinced of
that though), but we definitely don't need that at the moment. It is
important to have version checking against the data directory, which you
have done.

What I have done is extend collation objects with a collprovider column
that tells whether the collation is using POSIX (appropriate name?) or
ICU facilities. The pg_locale_t type is changed to a struct that
contains the provider-specific locale handles. Users of locale
information are changed to look into that struct for the appropriate
handle to use.

This sounds reasonable. I would prefer the POSIX provider to be named
something like SYSTEM though (think Windows).

In initdb, I initialize the default collation set as before from the
`locale -a` output, but also add all available ICU locales with a "%icu"
appended (so "fr_FR%icu"). I suppose one could create a configuration
option perhaps in initdb to change the default so that, say, "fr_FR"
uses ICU and "fr_FR%posix" uses the old stuff.

I wonder if we should have both %icu and %posix unconditionally and then
have an option to pick one of them for the list of collations without the
suffix (defaulting to posix for now but maybe not in the future).

That all works well enough for named collations and for sorting. The
thread about the FreeBSD ICU patch discusses some details of how to best
use the ICU APIs to do various aspects of the sorting, so I didn't focus
on that too much.

That's something for follow-up patches IMHO.

I'm not sure how well it will work to replace all the bits of LIKE and
regular expressions with ICU API calls. One problem is that ICU likes
to do case folding as a whole string, not by character. I need to do
more research about that.

Can't we use the regular expression api provided by ICU, or is that too
incompatible?

Another problem, which was also previously
discussed is that ICU does case folding in a locale-agnostic manner, so
it does not consider things such as the Turkish special cases. This is
per Unicode standard modulo weasel wording, but it breaks existing tests
at least.

I am quite sure it will break existing use cases as well. Don't know if
that's an issue if we keep multiple locale providers though. It's
expected that you get different behavior from them.

Where it gets really interesting is what to do with the database
locales. They just set the global process locale. So in order to port
that to ICU we'd need to check every implicit use of the process locale
and tweak it. We could add a datcollprovider column or something. But
we also rely on the datctype setting to validate the encoding of the
database. Maybe we wouldn't need that anymore, but it sounds risky.

I think we'll need to separate the process locale and the locale used
for string operations.

We could have a datcollation column that by OID references a collation
defined inside the database. With a background worker, we can log into
the database as it is being created and make adjustments, including
defining or adjusting collation definitions. This would open up
interesting new possibilities.

I don't really follow this but it sounds icky.

What is a way to go forward here? What's a minimal useful feature that
is future-proof? Just allow named collations referencing ICU for now?

I think that's the minimal useful feature indeed; it will move us forward,
especially on systems where POSIX locales are not that well implemented,
but it will not be a world changer just yet.

Is throwing out POSIX locales even for the process locale reasonable?

I don't think we can really do that, we'll need some kind of mapping
table between system locales and ICU locales (IIRC we build something
like that on windows as well). Maybe that mapping can be used for the
datctype to process locale conversion (datctype would then be provider
specific).

We also need to keep POSIX locale support anyway, otherwise we'd break
pg_upgrade I think (at least to the point that a pg_upgrade where you have
to reindex the whole database is much less feasible).

Oh, that case folding code in formatting.c needs some refactoring.
There are so many ifdefs there and it's repeated almost identically
three times, it's crazy to work in that.

Yuck, yes it wasn't pretty before and it's quite unreadable now, I think
this patch will have to do something about that.

The message about a collation provider version change could use some
improvements though, i.e. we could try to find affected indexes; there
might also be check constraints affected, etc. If nothing else it
deserves an explanation in the docs.

The icu conversion functions could use some docs.

I wonder if some of the pieces of code with the following pattern:

    if (mylocale->provider == COLLPROVIDER_ICU)
        do lots of ICU stuff
    else
        call the POSIX function

should move the ICU code to separate functions; it would certainly
improve overall readability.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#36Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Petr Jelinek (#35)
Re: ICU integration

Petr Jelinek wrote:

I'm not sure how well it will work to replace all the bits of LIKE and
regular expressions with ICU API calls. One problem is that ICU likes
to do case folding as a whole string, not by character. I need to do
more research about that.

Can't we use the regular expression api provided by ICU, or is that too
incompatible?

That probably won't fly very well -- it would be rather absurd to have
different regex behavior when ICU is compiled compared to when it's
not. Perhaps down the road we could have an extension that provides
ICU-based regex operators, but most likely not in core.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#37Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#36)
Re: ICU integration

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

Petr Jelinek wrote:

I'm not sure how well it will work to replace all the bits of LIKE and
regular expressions with ICU API calls. One problem is that ICU likes
to do case folding as a whole string, not by character. I need to do
more research about that.

Can't we use the regular expression api provided by ICU, or is that too
incompatible?

That probably won't fly very well -- it would be rather absurd to have
different regex behavior when ICU is compiled compared to when it's
not. Perhaps down the road we could have an extension that provides
ICU-based regex operators, but most likely not in core.

The regex stuff does not handle case-insensitivity by case-folding the
data string, anyway. Rather, it does that by converting 'a' or 'A'
in the pattern to the equivalent of '[aA]'. So performance for that
step is not terribly critical.

regards, tom lane


#38Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Peter Eisentraut (#1)
Re: ICU integration

On Wed, Aug 31, 2016 at 2:46 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

Here is a patch I've been working on to allow the use of ICU for sorting
and other locale things.

This is very interesting work, and it's great to see some development
in this area. I've been peripherally involved in more than one
collation-change-broke-my-data incident over the years. I took the
patch for a quick spin today. Here are a couple of initial
observations.

This patch adds pkg-config support to our configure script, in order
to produce the build options for ICU. That's cool, and I'm a fan of
pkg-config, but it's an extra dependency that I just wanted to
highlight. For example MacOSX appears to ship with ICU but has no
pkg-config or ICU .pc files out of the box (erm, I can't even find the
headers, so maybe that copy of ICU is useless to everyone except
Apple). The MacPorts ICU port ships with .pc files, so that's easy to
deal with, and I don't expect it to be difficult to get a working
pkg-config and meta-data installed alongside ICU on any platform that
doesn't already ship with them. It may well be useful for configuring
other packages. (There is also other cool stuff that would become
possible once ICU is optionally around, off topic.)

There is something missing from the configure script however: the
output of pkg-config is not making it into CFLAGS or LDFLAGS, so
building fails on FreeBSD and MacOSX where for example
<unicode/ucnv.h> doesn't live in the default search path. I tried
very briefly to work out what was wrong, but autoconfuse defeated me.

You call the built-in strcoll/strxfrm collation provider "posix", and
although POSIX does define those functions, it only does so because it
inherits them from ISO C90. As far as I can tell Windows provides
those functions because it conforms to the C spec, not the POSIX spec,
though I suppose you could argue that in that respect it DOES conform
to the POSIX spec... Also, we already have a collation called
"POSIX". Maybe we should avoid confusing people with a "posix"
provider and a "POSIX" collation? But then I'm not sure what else to
call it, but perhaps "system" as Petr Jelinek suggested [1], or "libc".

postgres=# select * from pg_collation where collname like 'en_NZ%';
┌──────────────────┬───────────────┬───────────┬──────────────┬──────────────┬──────────────────┬──────────────────┬─────────────┐
│     collname     │ collnamespace │ collowner │ collprovider │ collencoding │   collcollate    │    collctype     │ collversion │
├──────────────────┼───────────────┼───────────┼──────────────┼──────────────┼──────────────────┼──────────────────┼─────────────┤
│ en_NZ            │            11 │        10 │ p            │            6 │ en_NZ            │ en_NZ            │           0 │
│ en_NZ            │            11 │        10 │ p            │            8 │ en_NZ.ISO8859-1  │ en_NZ.ISO8859-1  │           0 │
│ en_NZ            │            11 │        10 │ p            │           16 │ en_NZ.ISO8859-15 │ en_NZ.ISO8859-15 │           0 │
│ en_NZ.ISO8859-1  │            11 │        10 │ p            │            8 │ en_NZ.ISO8859-1  │ en_NZ.ISO8859-1  │           0 │
│ en_NZ.ISO8859-15 │            11 │        10 │ p            │           16 │ en_NZ.ISO8859-15 │ en_NZ.ISO8859-15 │           0 │
│ en_NZ.UTF-8      │            11 │        10 │ p            │            6 │ en_NZ.UTF-8      │ en_NZ.UTF-8      │           0 │
│ en_NZ%icu        │            11 │        10 │ i            │           -1 │ en_NZ            │ en_NZ            │ -1724383232 │
└──────────────────┴───────────────┴───────────┴──────────────┴──────────────┴──────────────────┴──────────────────┴─────────────┘
(7 rows)

I notice that you use encoding -1, meaning "all". I haven't fully
grokked what that really means but it appears that we won't be able to
create any new such collations after initdb using CREATE COLLATION, if
for example you update your ICU installation and now have a new
collation available. When I tried dropping some and recreating them
they finished up with collencoding = 6. Should the initdb-created
rows use 6 too?

+    ucol_getVersion(collator, versioninfo);
+    numversion = ntohl(*((uint32 *) versioninfo));
+
+    if (numversion != collform->collversion)
+        ereport(WARNING,
+                (errmsg("ICU collator version mismatch"),
+                 errdetail("The database was created using version 0x%08X, the library provides version 0x%08X.",
+                           (uint32) collform->collversion, (uint32) numversion),
+                 errhint("Rebuild affected indexes, or build PostgreSQL with the right version of ICU.")));

I played around with bumping version numbers artificially. That gives
each session that accesses the collation one warning:

postgres=# select * from foo order by id;
WARNING: ICU collator version mismatch
DETAIL: The database was created using version 0x99380001, the
library provides version 0x99380000.
HINT: Rebuild affected indexes, or build PostgreSQL with the right
version of ICU.
┌────┐
│ id │
├────┤
└────┘
(0 rows)

postgres=# select * from foo order by id;
┌────┐
│ id │
├────┤
└────┘
(0 rows)

The warning persists even after I reindex all affected tables (and
start a new backend), because you're only recording the collation at
pg_collation level and then only setting it at initdb time, so that
HINT isn't much use for clearing the warning. I think you'd need to
record a snapshot of the version used for each collation that was used
on *each index*, and update it whenever you CREATE INDEX or REINDEX
etc. There can of course be more than one collation and thus version
involved, if you have columns with different collations, so I guess
you'd need a column to hold an array of version snapshots whose order
corresponds to pg_index.indcollation's.

I contemplated something like that for the built-in libc collations,
based on extensions or external scripts that would know how to
checksum the collation definition files for your particular operating
system [2] and track them per index (and per column) in a user table.
That's not material for a core feature but it did cause me to think
about this problem a little bit. That was based on the idea that
you'd run a script periodically or at least after OS upgrades, and
either have it reindex automatically or email you a set of commands
to run off hours.

Obviously you can't refuse to come up on mismatch, or you'll never be
able to REINDEX. Automatically launched unscheduled REINDEXes in core
would be antisocial and never fly. But a warning could easily go
unnoticed leading to corruption (not only reads, but also UNIQUE
CONSTRAINTs not enforced, yada yada yada). I wonder what else we
could reasonably do...

A couple of thoughts about abbreviated keys:

#ifndef TRUST_STRXFRM
if (!collate_c)
abbreviate = false;
#endif

I think this macro should affect only strxfrm, and we should trust
ucol_getSortKey or disable it independently. ICU's manual says
reassuring things like "Sort keys are most useful in databases" and
"Sort keys are generally only useful in databases or other
circumstances where function calls are extremely expensive".

It looks like varstr_abbrev_convert calls strxfrm unconditionally
(assuming TRUST_STRXFRM is defined). <captain-obvious>This needs to
use ucol_getSortKey instead when appropriate.</> It looks like it's a
bit more helpful than strxfrm about telling you the output buffer size
it wants, and it doesn't need nul termination, which is nice.
Unfortunately it is like strxfrm in that the output buffer's contents
is unspecified if it ran out of space.
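
Roughly what I mean, assuming the input has already been converted to
UChar (a sketch of the sizing dance only, not patch code):

#include <stdlib.h>
#include <unicode/ucol.h>

static uint8_t *
get_sort_key(const UCollator *coll, const UChar *s, int32_t slen,
             int32_t *keylen)
{
    /* a pre-flight call with a zero-sized buffer reports the full key length */
    int32_t     needed = ucol_getSortKey(coll, s, slen, NULL, 0);
    uint8_t    *buf = malloc(needed);

    if (buf == NULL)
        return NULL;
    *keylen = ucol_getSortKey(coll, s, slen, buf, needed);
    return buf;
}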

+++ b/src/test/regress/expected/collate.icu.out
@@ -0,0 +1,1076 @@
+/*
+ * This test is for Linux/glibc systems and assumes that a full set of
+ * locales is installed.  It must be run in a database with UTF-8 encoding,
+ * because other encodings don't support all the characters used.
+ */

Should say ICU.

[1]: /messages/by-id/3113bd83-9b3a-a95b-cf24-4b5b1dc6666f@2ndquadrant.com
[2]: https://github.com/macdice/check_pg_collations

--
Thomas Munro
http://www.enterprisedb.com


#39Peter Geoghegan
pg@heroku.com
In reply to: Thomas Munro (#38)
Re: ICU integration

On Fri, Sep 23, 2016 at 7:27 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:

A couple of thoughts about abbreviated keys:

#ifndef TRUST_STRXFRM
if (!collate_c)
abbreviate = false;
#endif

I think this macro should affect only strxfrm, and we should trust
ucol_getSortKey or disable it independently. ICU's manual says
reassuring things like "Sort keys are most useful in databases" and
"Sort keys are generally only useful in databases or other
circumstances where function calls are extremely expensive".

+1. Abbreviated keys are essential to get competitive performance
while sorting text, and the fact that ICU makes them safe to
reintroduce is a major advantage of adopting ICU. Perhaps we should
consider wrapping strxfrm() instead, though, so that other existing
callers of strxfrm() (I'm thinking of convert_string_datum()) almost
automatically do the right thing. In other words, maybe there should
be a pg_strxfrm() or something, with TRUST_STRXFRM changed to be
something that can dynamically resolve whether or not it's a collation
managed by a trusted collation provider (this could only be resolved
at runtime, which I think is fine).

It looks like varstr_abbrev_convert calls strxfrm unconditionally
(assuming TRUST_STRXFRM is defined). <captain-obvious>This needs to
use ucol_getSortKey instead when appropriate.</> It looks like it's a
bit more helpful than strxfrm about telling you the output buffer size
it wants, and it doesn't need nul termination, which is nice.
Unfortunately it is like strxfrm in that the output buffer's contents
is unspecified if it ran out of space.

One can use the ucol_nextSortKeyPart() interface to just get the first
4/8 bytes of an abbreviated key, reducing the overhead somewhat, so
the output buffer size limitation is probably irrelevant. The ICU
documentation says something about this being useful for Radix sort,
but I suspect it's more often used to generate abbreviated keys.
Abbreviated keys were not my original idea. They're really just a
standard technique.
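
Roughly (a sketch only; names and buffer handling are made up): you read
the UTF-8 datum through a UCharIterator and ask for just the first few
bytes of the key:

#include <unicode/ucol.h>
#include <unicode/uiter.h>

static int32_t
icu_abbrev_key(const UCollator *coll, const char *utf8, int32_t len,
               uint8_t *dest, int32_t destsize)
{
    UCharIterator iter;
    uint32_t    state[2] = {0, 0};
    UErrorCode  status = U_ZERO_ERROR;

    uiter_setUTF8(&iter, utf8, len);
    /* ask only for the first destsize bytes of the sort key */
    return ucol_nextSortKeyPart(coll, &iter, state, dest, destsize, &status);
}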

--
Peter Geoghegan


#40Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Peter Geoghegan (#39)
Re: ICU integration

On Sat, Sep 24, 2016 at 10:13 PM, Peter Geoghegan <pg@heroku.com> wrote:

On Fri, Sep 23, 2016 at 7:27 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:

It looks like varstr_abbrev_convert calls strxfrm unconditionally
(assuming TRUST_STRXFRM is defined). <captain-obvious>This needs to
use ucol_getSortKey instead when appropriate.</> It looks like it's a
bit more helpful than strxfrm about telling you the output buffer size
it wants, and it doesn't need nul termination, which is nice.
Unfortunately it is like strxfrm in that the output buffer's contents
is unspecified if it ran out of space.

One can use the ucol_nextSortKeyPart() interface to just get the first
4/8 bytes of an abbreviated key, reducing the overhead somewhat, so
the output buffer size limitation is probably irrelevant. The ICU
documentation says something about this being useful for Radix sort,
but I suspect it's more often used to generate abbreviated keys.
Abbreviated keys were not my original idea. They're really just a
standard technique.

Nice! The other advantage of ucol_nextSortKeyPart is that you don't have
to convert the whole string to UChar (UTF16) first, as I think you would
need to with ucol_getSortKey, because the UCharIterator mechanism can read
directly from a UTF8 string. I see in the documentation that
ucol_nextSortKeyPart and ucol_getSortKey don't have compatible output, and
this caveat may be related to whether sort key compression is used. I
don't understand what sort of compression is involved but out of curiosity
I asked ICU to spit out some sort keys from ucol_nextSortKeyPart so I could
see their size. As you say, we can ask it to stop at 4 or 8 bytes which is
very convenient for our purposes, but here I asked for more to get the full
output so I could see where the primary weight part ends. The primary
weight took one byte for the Latin letters I tried and two for the Japanese
characters I tried (except 一 which was just 0xaa).

ucol_nextSortKeyPart(en_US, "a", ...) -> 29 01 05 01 05
ucol_nextSortKeyPart(en_US, "ab", ...) -> 29 2b 01 06 01 06
ucol_nextSortKeyPart(en_US, "abc", ...) -> 29 2b 2d 01 07 01 07
ucol_nextSortKeyPart(en_US, "abcd", ...) -> 29 2b 2d 2f 01 08 01 08
ucol_nextSortKeyPart(en_US, "A", ...) -> 29 01 05 01 dc
ucol_nextSortKeyPart(en_US, "AB", ...) -> 29 2b 01 06 01 dc dc
ucol_nextSortKeyPart(en_US, "ABC", ...) -> 29 2b 2d 01 07 01 dc dc dc
ucol_nextSortKeyPart(en_US, "ABCD", ...) -> 29 2b 2d 2f 01 08 01 dc dc dc dc
ucol_nextSortKeyPart(ja_JP, "一", ...) -> aa 01 05 01 05
ucol_nextSortKeyPart(ja_JP, "一二", ...) -> aa d0 0f 01 06 01 06
ucol_nextSortKeyPart(ja_JP, "一二三", ...) -> aa d0 0f cb b8 01 07 01 07
ucol_nextSortKeyPart(ja_JP, "一二三四", ...) -> aa d0 0f cb b8 cb d5 01 08 01 08
ucol_nextSortKeyPart(ja_JP, "日", ...) -> d0 18 01 05 01 05
ucol_nextSortKeyPart(ja_JP, "日本", ...) -> d0 18 d1 d0 01 06 01 06
ucol_nextSortKeyPart(fr_FR, "cote", ...) -> 2d 45 4f 31 01 08 01 08
ucol_nextSortKeyPart(fr_FR, "côte", ...) -> 2d 45 4f 31 01 44 8e 06 01 09
ucol_nextSortKeyPart(fr_FR, "coté", ...) -> 2d 45 4f 31 01 42 88 01 09
ucol_nextSortKeyPart(fr_FR, "côté", ...) -> 2d 45 4f 31 01 44 8e 44 88 01 0a
ucol_nextSortKeyPart(fr_CA, "cote", ...) -> 2d 45 4f 31 01 08 01 08
ucol_nextSortKeyPart(fr_CA, "côte", ...) -> 2d 45 4f 31 01 44 8e 06 01 09
ucol_nextSortKeyPart(fr_CA, "coté", ...) -> 2d 45 4f 31 01 88 08 01 09
ucol_nextSortKeyPart(fr_CA, "côté", ...) -> 2d 45 4f 31 01 88 44 8e 06 01 0a

I wonder how it manages to deal with fr_CA's reversed secondary weighting
rule which requires you to consider diacritics in reverse order --
apparently abandoned in France but still used in Canada -- using a fixed
size space for state between calls.

--
Thomas Munro
http://www.enterprisedb.com

#41Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Thomas Munro (#38)
Re: ICU integration

On Fri, Sep 23, 2016 at 6:27 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:

On Wed, Aug 31, 2016 at 2:46 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

Here is a patch I've been working on to allow the use of ICU for sorting
and other locale things.

This is very interesting work, and it's great to see some development
in this area. I've been peripherally involved in more than one
collation-change-broke-my-data incident over the years. I took the
patch for a quick spin today. Here are a couple of initial
observations.

This seems like a solid start, but there are unresolved questions
about both high level goals (versioning strategy etc) and also some
technical details with this WIP patch. It looks like several people
have an interest and ideas in this area, but clearly there isn't going
to be a committable patch in the next 48 hours. So I will set this to
'Returned with Feedback' for now. If you think you'll have a new
patch for the next CF then it looks like you can still 'Move to Next
CF' from 'Returned with Feedback' state if appropriate. Thanks!

--
Thomas Munro
http://www.enterprisedb.com


#42Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Thomas Munro (#38)
Re: ICU integration

On 9/23/16 2:27 AM, Thomas Munro wrote:

This patch adds pkg-config support to our configure script, in order
to produce the build options for ICU. That's cool, and I'm a fan of
pkg-config, but it's an extra dependency that I just wanted to
highlight. For example MacOSX appears to ship with ICU but has no
pkg-config or ICU .pc files out of the box (erm, I can't even find the
headers, so maybe that copy of ICU is useless to everyone except
Apple).

The Homebrew package icu4c contains this note:

OS X provides libicucore.dylib (but nothing else).

That probably explains what you are seeing.

There is something missing from the configure script however: the
output of pkg-config is not making it into CFLAGS or LDFLAGS, so
building fails on FreeBSD and MacOSX where for example
<unicode/ucnv.h> doesn't live in the default search path.

It sets ICU_CFLAGS and ICU_LIBS, but it seems I didn't put ICU_CFLAGS in
the backend makefiles. So that should be easy to fix.

You call the built-in strcoll/strxfrm collation provider "posix", and
although POSIX does define those functions, it only does so because it
inherits them from ISO C90.

POSIX defines strcoll_l() and such. But I agree SYSTEM might be better.

I notice that you use encoding -1, meaning "all".

The use of encoding -1 is existing behavior.

I haven't fully
grokked what that really means but it appears that we won't be able to
create any new such collations after initdb using CREATE COLLATION, if
for example you update your ICU installation and now have a new
collation available. When I tried dropping some and recreating them
they finished up with collencoding = 6. Should the initdb-created
rows use 6 too?

Good observation. That will need some fine-tuning.

The warning persists even after I reindex all affected tables (and
start a new backend), because you're only recording the collation at
pg_collation level and then only setting it at initdb time, so that
HINT isn't much use for clearing the warning. I think you'd need to
record a snapshot of the version used for each collation that was used
on *each index*, and update it whenever you CREATE INDEX or REINDEX
etc. There can of course be more than one collation and thus version
involved, if you have columns with different collations, so I guess
you'd need an column to hold an array of version snapshots whose order
corresponds to pg_index.indcollation's.

Hmm, yeah, that will need more work. To be completely correct, I think,
we'd also need to record the version in each expression node, so that
check constraints of the form CHECK (x > 'abc') can be handled.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#43Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Peter Eisentraut (#42)
Re: ICU integration

On Sat, Oct 1, 2016 at 3:30 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 9/23/16 2:27 AM, Thomas Munro wrote:

The warning persists even after I reindex all affected tables (and
start a new backend), because you're only recording the collation at
pg_collation level and then only setting it at initdb time, so that
HINT isn't much use for clearing the warning. I think you'd need to
record a snapshot of the version used for each collation that was used
on *each index*, and update it whenever you CREATE INDEX or REINDEX
etc. There can of course be more than one collation and thus version
involved, if you have columns with different collations, so I guess
you'd need an column to hold an array of version snapshots whose order
corresponds to pg_index.indcollation's.

Hmm, yeah, that will need more work. To be completely correct, I think,
we'd also need to record the version in each expression node, so that
check constraints of the form CHECK (x > 'abc') can be handled.

Hmm. That is quite a rabbit hole. In theory you need to recheck such
a constraint, but it's not at all clear when you should recheck and
what you should do about it if it fails. Similar for the future
PARTITION feature.

--
Thomas Munro
http://www.enterprisedb.com


#44Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Thomas Munro (#43)
Re: ICU integration

On 9/30/16 4:32 PM, Thomas Munro wrote:

Hmm, yeah, that will need more work. To be completely correct, I think,

we'd also need to record the version in each expression node, so that
check constraints of the form CHECK (x > 'abc') can be handled.

Hmm. That is quite a rabbit hole. In theory you need to recheck such
a constraint, but it's not at all clear when you should recheck and
what you should do about it if it fails. Similar for the future
PARTITION feature.

I think it's not worth dealing with this in that much detail at the
moment. It's not like the collation will just randomly change under you
(unlike with glibc). It would have to involve pg_upgrade, physical
replication, or a rebuilt installation. So I think I will change the
message to something to the effect of "however you got here, you can't
do that". We can develop some recipes and ideas on the side for how to
recover from situations like that and then maybe integrate tooling for that
later.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#45Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Peter Eisentraut (#1)
1 attachment(s)
Re: ICU integration

Updated patch attached.

The previous round of reviews showed that there was general agreement on
the approach. So I have focused on filling in the gaps, added ICU
support to all the locale-using places, added documentation, fixed some
minor issues that have been pointed out. Everything appears to work
correctly now, and the user-facing feature set is pretty well-rounded.

I don't have much experience with the abbreviated key stuff. I have
filled in what I think should work, but it needs detailed review.

Similarly, some of the stuff in the regular expression code was hacked
in blindly.

One minor problem is that we now support adding any-encoding collation
entries, which violates some of the comments in CollationCreate(). See
FIXME there. It doesn't seem worth major contortions to fix that; maybe
it just has to be documented better.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v2-0001-ICU-support.patch (text/x-patch)
From 115c8f901c5bcd7463a4b01b53d41144bcbd6d7a Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter_e@gmx.net>
Date: Tue, 27 Dec 2016 12:00:00 -0500
Subject: [PATCH v2] ICU support

Add a column collprovider to pg_collation that determines which library
provides the collation data.  The existing choices are default and libc,
and this adds an icu choice, which uses the ICU4C library.

The pg_locale_t type is changed to a struct that contains the
provider-specific locale handles.  Users of locale information are
changed to look into that struct for the appropriate handle to use.

Also add a collversion column that records the version of the collation
when it is created, and check at run time whether it is still the same.
This detects potentially incompatible library upgrades that can corrupt
indexes and other structures.  This is currently only supported by
ICU-provided collations.

initdb initializes the default collation set as before from the `locale
-a` output but also adds all available ICU locales with a "%icu"
appended.

Currently, ICU-provided collations can only be explicitly named
collations.  The global database locales are still always libc-provided.
---
 aclocal.m4                                |    1 +
 config/pkg.m4                             |  275 +++++++
 configure                                 |  313 ++++++++
 configure.in                              |   35 +
 doc/src/sgml/catalogs.sgml                |   19 +
 doc/src/sgml/charset.sgml                 |   43 +-
 doc/src/sgml/installation.sgml            |   14 +
 doc/src/sgml/ref/alter_collation.sgml     |   42 ++
 doc/src/sgml/ref/create_collation.sgml    |   16 +-
 src/Makefile.global.in                    |    4 +
 src/backend/Makefile                      |    2 +-
 src/backend/catalog/pg_collation.c        |   15 +-
 src/backend/commands/collationcmds.c      |  132 +++-
 src/backend/common.mk                     |    2 +
 src/backend/nodes/copyfuncs.c             |   13 +
 src/backend/nodes/equalfuncs.c            |   11 +
 src/backend/parser/gram.y                 |   18 +-
 src/backend/regex/regc_pg_locale.c        |  110 ++-
 src/backend/tcop/utility.c                |    8 +
 src/backend/utils/adt/formatting.c        |  453 ++++++------
 src/backend/utils/adt/like.c              |   53 +-
 src/backend/utils/adt/pg_locale.c         |  160 ++++-
 src/backend/utils/adt/selfuncs.c          |    6 +-
 src/backend/utils/adt/varlena.c           |  108 ++-
 src/bin/initdb/Makefile                   |    4 +-
 src/bin/initdb/initdb.c                   |   50 +-
 src/bin/pg_dump/pg_dump.c                 |   55 +-
 src/include/catalog/pg_collation.h        |   23 +-
 src/include/catalog/pg_collation_fn.h     |    4 +-
 src/include/commands/collationcmds.h      |    1 +
 src/include/nodes/nodes.h                 |    1 +
 src/include/nodes/parsenodes.h            |   11 +
 src/include/pg_config.h.in                |    6 +
 src/include/utils/pg_locale.h             |   27 +-
 src/test/regress/GNUmakefile              |    3 +
 src/test/regress/expected/collate.icu.out | 1107 +++++++++++++++++++++++++++++
 src/test/regress/sql/collate.icu.sql      |  420 +++++++++++
 37 files changed, 3209 insertions(+), 356 deletions(-)
 create mode 100644 config/pkg.m4
 create mode 100644 src/test/regress/expected/collate.icu.out
 create mode 100644 src/test/regress/sql/collate.icu.sql

diff --git a/aclocal.m4 b/aclocal.m4
index 6f930b6fc1..5ca902b6a2 100644
--- a/aclocal.m4
+++ b/aclocal.m4
@@ -7,6 +7,7 @@ m4_include([config/docbook.m4])
 m4_include([config/general.m4])
 m4_include([config/libtool.m4])
 m4_include([config/perl.m4])
+m4_include([config/pkg.m4])
 m4_include([config/programs.m4])
 m4_include([config/python.m4])
 m4_include([config/tcl.m4])
diff --git a/config/pkg.m4 b/config/pkg.m4
new file mode 100644
index 0000000000..82bea96ee7
--- /dev/null
+++ b/config/pkg.m4
@@ -0,0 +1,275 @@
+dnl pkg.m4 - Macros to locate and utilise pkg-config.   -*- Autoconf -*-
+dnl serial 11 (pkg-config-0.29.1)
+dnl
+dnl Copyright © 2004 Scott James Remnant <scott@netsplit.com>.
+dnl Copyright © 2012-2015 Dan Nicholson <dbn.lists@gmail.com>
+dnl
+dnl This program is free software; you can redistribute it and/or modify
+dnl it under the terms of the GNU General Public License as published by
+dnl the Free Software Foundation; either version 2 of the License, or
+dnl (at your option) any later version.
+dnl
+dnl This program is distributed in the hope that it will be useful, but
+dnl WITHOUT ANY WARRANTY; without even the implied warranty of
+dnl MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+dnl General Public License for more details.
+dnl
+dnl You should have received a copy of the GNU General Public License
+dnl along with this program; if not, write to the Free Software
+dnl Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA
+dnl 02111-1307, USA.
+dnl
+dnl As a special exception to the GNU General Public License, if you
+dnl distribute this file as part of a program that contains a
+dnl configuration script generated by Autoconf, you may include it under
+dnl the same distribution terms that you use for the rest of that
+dnl program.
+
+dnl PKG_PREREQ(MIN-VERSION)
+dnl -----------------------
+dnl Since: 0.29
+dnl
+dnl Verify that the version of the pkg-config macros are at least
+dnl MIN-VERSION. Unlike PKG_PROG_PKG_CONFIG, which checks the user's
+dnl installed version of pkg-config, this checks the developer's version
+dnl of pkg.m4 when generating configure.
+dnl
+dnl To ensure that this macro is defined, also add:
+dnl m4_ifndef([PKG_PREREQ],
+dnl     [m4_fatal([must install pkg-config 0.29 or later before running autoconf/autogen])])
+dnl
+dnl See the "Since" comment for each macro you use to see what version
+dnl of the macros you require.
+m4_defun([PKG_PREREQ],
+[m4_define([PKG_MACROS_VERSION], [0.29.1])
+m4_if(m4_version_compare(PKG_MACROS_VERSION, [$1]), -1,
+    [m4_fatal([pkg.m4 version $1 or higher is required but ]PKG_MACROS_VERSION[ found])])
+])dnl PKG_PREREQ
+
+dnl PKG_PROG_PKG_CONFIG([MIN-VERSION])
+dnl ----------------------------------
+dnl Since: 0.16
+dnl
+dnl Search for the pkg-config tool and set the PKG_CONFIG variable to
+dnl first found in the path. Checks that the version of pkg-config found
+dnl is at least MIN-VERSION. If MIN-VERSION is not specified, 0.9.0 is
+dnl used since that's the first version where most current features of
+dnl pkg-config existed.
+AC_DEFUN([PKG_PROG_PKG_CONFIG],
+[m4_pattern_forbid([^_?PKG_[A-Z_]+$])
+m4_pattern_allow([^PKG_CONFIG(_(PATH|LIBDIR|SYSROOT_DIR|ALLOW_SYSTEM_(CFLAGS|LIBS)))?$])
+m4_pattern_allow([^PKG_CONFIG_(DISABLE_UNINSTALLED|TOP_BUILD_DIR|DEBUG_SPEW)$])
+AC_ARG_VAR([PKG_CONFIG], [path to pkg-config utility])
+AC_ARG_VAR([PKG_CONFIG_PATH], [directories to add to pkg-config's search path])
+AC_ARG_VAR([PKG_CONFIG_LIBDIR], [path overriding pkg-config's built-in search path])
+
+if test "x$ac_cv_env_PKG_CONFIG_set" != "xset"; then
+	AC_PATH_TOOL([PKG_CONFIG], [pkg-config])
+fi
+if test -n "$PKG_CONFIG"; then
+	_pkg_min_version=m4_default([$1], [0.9.0])
+	AC_MSG_CHECKING([pkg-config is at least version $_pkg_min_version])
+	if $PKG_CONFIG --atleast-pkgconfig-version $_pkg_min_version; then
+		AC_MSG_RESULT([yes])
+	else
+		AC_MSG_RESULT([no])
+		PKG_CONFIG=""
+	fi
+fi[]dnl
+])dnl PKG_PROG_PKG_CONFIG
+
+dnl PKG_CHECK_EXISTS(MODULES, [ACTION-IF-FOUND], [ACTION-IF-NOT-FOUND])
+dnl -------------------------------------------------------------------
+dnl Since: 0.18
+dnl
+dnl Check to see whether a particular set of modules exists. Similar to
+dnl PKG_CHECK_MODULES(), but does not set variables or print errors.
+dnl
+dnl Please remember that m4 expands AC_REQUIRE([PKG_PROG_PKG_CONFIG])
+dnl only at the first occurence in configure.ac, so if the first place
+dnl it's called might be skipped (such as if it is within an "if", you
+dnl have to call PKG_CHECK_EXISTS manually
+AC_DEFUN([PKG_CHECK_EXISTS],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])dnl
+if test -n "$PKG_CONFIG" && \
+    AC_RUN_LOG([$PKG_CONFIG --exists --print-errors "$1"]); then
+  m4_default([$2], [:])
+m4_ifvaln([$3], [else
+  $3])dnl
+fi])
+
+dnl _PKG_CONFIG([VARIABLE], [COMMAND], [MODULES])
+dnl ---------------------------------------------
+dnl Internal wrapper calling pkg-config via PKG_CONFIG and setting
+dnl pkg_failed based on the result.
+m4_define([_PKG_CONFIG],
+[if test -n "$$1"; then
+    pkg_cv_[]$1="$$1"
+ elif test -n "$PKG_CONFIG"; then
+    PKG_CHECK_EXISTS([$3],
+                     [pkg_cv_[]$1=`$PKG_CONFIG --[]$2 "$3" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes ],
+		     [pkg_failed=yes])
+ else
+    pkg_failed=untried
+fi[]dnl
+])dnl _PKG_CONFIG
+
+dnl _PKG_SHORT_ERRORS_SUPPORTED
+dnl ---------------------------
+dnl Internal check to see if pkg-config supports short errors.
+AC_DEFUN([_PKG_SHORT_ERRORS_SUPPORTED],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])
+if $PKG_CONFIG --atleast-pkgconfig-version 0.20; then
+        _pkg_short_errors_supported=yes
+else
+        _pkg_short_errors_supported=no
+fi[]dnl
+])dnl _PKG_SHORT_ERRORS_SUPPORTED
+
+
+dnl PKG_CHECK_MODULES(VARIABLE-PREFIX, MODULES, [ACTION-IF-FOUND],
+dnl   [ACTION-IF-NOT-FOUND])
+dnl --------------------------------------------------------------
+dnl Since: 0.4.0
+dnl
+dnl Note that if there is a possibility the first call to
+dnl PKG_CHECK_MODULES might not happen, you should be sure to include an
+dnl explicit call to PKG_PROG_PKG_CONFIG in your configure.ac
+AC_DEFUN([PKG_CHECK_MODULES],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])dnl
+AC_ARG_VAR([$1][_CFLAGS], [C compiler flags for $1, overriding pkg-config])dnl
+AC_ARG_VAR([$1][_LIBS], [linker flags for $1, overriding pkg-config])dnl
+
+pkg_failed=no
+AC_MSG_CHECKING([for $1])
+
+_PKG_CONFIG([$1][_CFLAGS], [cflags], [$2])
+_PKG_CONFIG([$1][_LIBS], [libs], [$2])
+
+m4_define([_PKG_TEXT], [Alternatively, you may set the environment variables $1[]_CFLAGS
+and $1[]_LIBS to avoid the need to call pkg-config.
+See the pkg-config man page for more details.])
+
+if test $pkg_failed = yes; then
+   	AC_MSG_RESULT([no])
+        _PKG_SHORT_ERRORS_SUPPORTED
+        if test $_pkg_short_errors_supported = yes; then
+	        $1[]_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "$2" 2>&1`
+        else 
+	        $1[]_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "$2" 2>&1`
+        fi
+	# Put the nasty error message in config.log where it belongs
+	echo "$$1[]_PKG_ERRORS" >&AS_MESSAGE_LOG_FD
+
+	m4_default([$4], [AC_MSG_ERROR(
+[Package requirements ($2) were not met:
+
+$$1_PKG_ERRORS
+
+Consider adjusting the PKG_CONFIG_PATH environment variable if you
+installed software in a non-standard prefix.
+
+_PKG_TEXT])[]dnl
+        ])
+elif test $pkg_failed = untried; then
+     	AC_MSG_RESULT([no])
+	m4_default([$4], [AC_MSG_FAILURE(
+[The pkg-config script could not be found or is too old.  Make sure it
+is in your PATH or set the PKG_CONFIG environment variable to the full
+path to pkg-config.
+
+_PKG_TEXT
+
+To get pkg-config, see <http://pkg-config.freedesktop.org/>.])[]dnl
+        ])
+else
+	$1[]_CFLAGS=$pkg_cv_[]$1[]_CFLAGS
+	$1[]_LIBS=$pkg_cv_[]$1[]_LIBS
+        AC_MSG_RESULT([yes])
+	$3
+fi[]dnl
+])dnl PKG_CHECK_MODULES
+
+
+dnl PKG_CHECK_MODULES_STATIC(VARIABLE-PREFIX, MODULES, [ACTION-IF-FOUND],
+dnl   [ACTION-IF-NOT-FOUND])
+dnl ---------------------------------------------------------------------
+dnl Since: 0.29
+dnl
+dnl Checks for existence of MODULES and gathers its build flags with
+dnl static libraries enabled. Sets VARIABLE-PREFIX_CFLAGS from --cflags
+dnl and VARIABLE-PREFIX_LIBS from --libs.
+dnl
+dnl Note that if there is a possibility the first call to
+dnl PKG_CHECK_MODULES_STATIC might not happen, you should be sure to
+dnl include an explicit call to PKG_PROG_PKG_CONFIG in your
+dnl configure.ac.
+AC_DEFUN([PKG_CHECK_MODULES_STATIC],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])dnl
+_save_PKG_CONFIG=$PKG_CONFIG
+PKG_CONFIG="$PKG_CONFIG --static"
+PKG_CHECK_MODULES($@)
+PKG_CONFIG=$_save_PKG_CONFIG[]dnl
+])dnl PKG_CHECK_MODULES_STATIC
+
+
+dnl PKG_INSTALLDIR([DIRECTORY])
+dnl -------------------------
+dnl Since: 0.27
+dnl
+dnl Substitutes the variable pkgconfigdir as the location where a module
+dnl should install pkg-config .pc files. By default the directory is
+dnl $libdir/pkgconfig, but the default can be changed by passing
+dnl DIRECTORY. The user can override through the --with-pkgconfigdir
+dnl parameter.
+AC_DEFUN([PKG_INSTALLDIR],
+[m4_pushdef([pkg_default], [m4_default([$1], ['${libdir}/pkgconfig'])])
+m4_pushdef([pkg_description],
+    [pkg-config installation directory @<:@]pkg_default[@:>@])
+AC_ARG_WITH([pkgconfigdir],
+    [AS_HELP_STRING([--with-pkgconfigdir], pkg_description)],,
+    [with_pkgconfigdir=]pkg_default)
+AC_SUBST([pkgconfigdir], [$with_pkgconfigdir])
+m4_popdef([pkg_default])
+m4_popdef([pkg_description])
+])dnl PKG_INSTALLDIR
+
+
+dnl PKG_NOARCH_INSTALLDIR([DIRECTORY])
+dnl --------------------------------
+dnl Since: 0.27
+dnl
+dnl Substitutes the variable noarch_pkgconfigdir as the location where a
+dnl module should install arch-independent pkg-config .pc files. By
+dnl default the directory is $datadir/pkgconfig, but the default can be
+dnl changed by passing DIRECTORY. The user can override through the
+dnl --with-noarch-pkgconfigdir parameter.
+AC_DEFUN([PKG_NOARCH_INSTALLDIR],
+[m4_pushdef([pkg_default], [m4_default([$1], ['${datadir}/pkgconfig'])])
+m4_pushdef([pkg_description],
+    [pkg-config arch-independent installation directory @<:@]pkg_default[@:>@])
+AC_ARG_WITH([noarch-pkgconfigdir],
+    [AS_HELP_STRING([--with-noarch-pkgconfigdir], pkg_description)],,
+    [with_noarch_pkgconfigdir=]pkg_default)
+AC_SUBST([noarch_pkgconfigdir], [$with_noarch_pkgconfigdir])
+m4_popdef([pkg_default])
+m4_popdef([pkg_description])
+])dnl PKG_NOARCH_INSTALLDIR
+
+
+dnl PKG_CHECK_VAR(VARIABLE, MODULE, CONFIG-VARIABLE,
+dnl [ACTION-IF-FOUND], [ACTION-IF-NOT-FOUND])
+dnl -------------------------------------------
+dnl Since: 0.28
+dnl
+dnl Retrieves the value of the pkg-config variable for the given module.
+AC_DEFUN([PKG_CHECK_VAR],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])dnl
+AC_ARG_VAR([$1], [value of $3 for $2, overriding pkg-config])dnl
+
+_PKG_CONFIG([$1], [variable="][$3]["], [$2])
+AS_VAR_COPY([$1], [pkg_cv_][$1])
+
+AS_VAR_IF([$1], [""], [$5], [$4])dnl
+])dnl PKG_CHECK_VAR
diff --git a/configure b/configure
index 0f143a0fad..d0099c9fe9 100755
--- a/configure
+++ b/configure
@@ -715,6 +715,12 @@ krb_srvtab
 with_python
 with_perl
 with_tcl
+ICU_LIBS
+ICU_CFLAGS
+PKG_CONFIG_LIBDIR
+PKG_CONFIG_PATH
+PKG_CONFIG
+with_icu
 enable_thread_safety
 INCLUDES
 autodepend
@@ -821,6 +827,7 @@ with_CC
 enable_depend
 enable_cassert
 enable_thread_safety
+with_icu
 with_tcl
 with_tclconfig
 with_perl
@@ -856,6 +863,11 @@ LDFLAGS
 LIBS
 CPPFLAGS
 CPP
+PKG_CONFIG
+PKG_CONFIG_PATH
+PKG_CONFIG_LIBDIR
+ICU_CFLAGS
+ICU_LIBS
 LDFLAGS_EX
 LDFLAGS_SL
 DOCBOOKSTYLE'
@@ -1511,6 +1523,7 @@ Optional Packages:
   --with-wal-segsize=SEGSIZE
                           set WAL segment size in MB [16]
   --with-CC=CMD           set compiler (deprecated)
+  --with-icu              build with ICU support
   --with-tcl              build Tcl modules (PL/Tcl)
   --with-tclconfig=DIR    tclConfig.sh is in DIR
   --with-perl             build Perl modules (PL/Perl)
@@ -1546,6 +1559,13 @@ Some influential environment variables:
   CPPFLAGS    (Objective) C/C++ preprocessor flags, e.g. -I<include dir> if
               you have headers in a nonstandard directory <include dir>
   CPP         C preprocessor
+  PKG_CONFIG  path to pkg-config utility
+  PKG_CONFIG_PATH
+              directories to add to pkg-config's search path
+  PKG_CONFIG_LIBDIR
+              path overriding pkg-config's built-in search path
+  ICU_CFLAGS  C compiler flags for ICU, overriding pkg-config
+  ICU_LIBS    linker flags for ICU, overriding pkg-config
   LDFLAGS_EX  extra linker flags for linking executables only
   LDFLAGS_SL  extra linker flags for linking shared libraries only
   DOCBOOKSTYLE
@@ -5368,6 +5388,255 @@ $as_echo "$enable_thread_safety" >&6; }
 
 
 #
+# ICU
+#
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with ICU support" >&5
+$as_echo_n "checking whether to build with ICU support... " >&6; }
+
+
+
+# Check whether --with-icu was given.
+if test "${with_icu+set}" = set; then :
+  withval=$with_icu;
+  case $withval in
+    yes)
+
+$as_echo "#define USE_ICU 1" >>confdefs.h
+
+      ;;
+    no)
+      :
+      ;;
+    *)
+      as_fn_error $? "no argument expected for --with-icu option" "$LINENO" 5
+      ;;
+  esac
+
+else
+  with_icu=no
+
+fi
+
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $with_icu" >&5
+$as_echo "$with_icu" >&6; }
+
+
+if test "$with_icu" = yes; then
+
+
+
+
+
+
+
+if test "x$ac_cv_env_PKG_CONFIG_set" != "xset"; then
+	if test -n "$ac_tool_prefix"; then
+  # Extract the first word of "${ac_tool_prefix}pkg-config", so it can be a program name with args.
+set dummy ${ac_tool_prefix}pkg-config; ac_word=$2
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
+$as_echo_n "checking for $ac_word... " >&6; }
+if ${ac_cv_path_PKG_CONFIG+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  case $PKG_CONFIG in
+  [\\/]* | ?:[\\/]*)
+  ac_cv_path_PKG_CONFIG="$PKG_CONFIG" # Let the user override the test with a path.
+  ;;
+  *)
+  as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
+for as_dir in $PATH
+do
+  IFS=$as_save_IFS
+  test -z "$as_dir" && as_dir=.
+    for ac_exec_ext in '' $ac_executable_extensions; do
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
+    ac_cv_path_PKG_CONFIG="$as_dir/$ac_word$ac_exec_ext"
+    $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
+    break 2
+  fi
+done
+  done
+IFS=$as_save_IFS
+
+  ;;
+esac
+fi
+PKG_CONFIG=$ac_cv_path_PKG_CONFIG
+if test -n "$PKG_CONFIG"; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $PKG_CONFIG" >&5
+$as_echo "$PKG_CONFIG" >&6; }
+else
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+fi
+
+
+fi
+if test -z "$ac_cv_path_PKG_CONFIG"; then
+  ac_pt_PKG_CONFIG=$PKG_CONFIG
+  # Extract the first word of "pkg-config", so it can be a program name with args.
+set dummy pkg-config; ac_word=$2
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
+$as_echo_n "checking for $ac_word... " >&6; }
+if ${ac_cv_path_ac_pt_PKG_CONFIG+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  case $ac_pt_PKG_CONFIG in
+  [\\/]* | ?:[\\/]*)
+  ac_cv_path_ac_pt_PKG_CONFIG="$ac_pt_PKG_CONFIG" # Let the user override the test with a path.
+  ;;
+  *)
+  as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
+for as_dir in $PATH
+do
+  IFS=$as_save_IFS
+  test -z "$as_dir" && as_dir=.
+    for ac_exec_ext in '' $ac_executable_extensions; do
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
+    ac_cv_path_ac_pt_PKG_CONFIG="$as_dir/$ac_word$ac_exec_ext"
+    $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
+    break 2
+  fi
+done
+  done
+IFS=$as_save_IFS
+
+  ;;
+esac
+fi
+ac_pt_PKG_CONFIG=$ac_cv_path_ac_pt_PKG_CONFIG
+if test -n "$ac_pt_PKG_CONFIG"; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_pt_PKG_CONFIG" >&5
+$as_echo "$ac_pt_PKG_CONFIG" >&6; }
+else
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+fi
+
+  if test "x$ac_pt_PKG_CONFIG" = x; then
+    PKG_CONFIG=""
+  else
+    case $cross_compiling:$ac_tool_warned in
+yes:)
+{ $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: using cross tools not prefixed with host triplet" >&5
+$as_echo "$as_me: WARNING: using cross tools not prefixed with host triplet" >&2;}
+ac_tool_warned=yes ;;
+esac
+    PKG_CONFIG=$ac_pt_PKG_CONFIG
+  fi
+else
+  PKG_CONFIG="$ac_cv_path_PKG_CONFIG"
+fi
+
+fi
+if test -n "$PKG_CONFIG"; then
+	_pkg_min_version=0.9.0
+	{ $as_echo "$as_me:${as_lineno-$LINENO}: checking pkg-config is at least version $_pkg_min_version" >&5
+$as_echo_n "checking pkg-config is at least version $_pkg_min_version... " >&6; }
+	if $PKG_CONFIG --atleast-pkgconfig-version $_pkg_min_version; then
+		{ $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5
+$as_echo "yes" >&6; }
+	else
+		{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+		PKG_CONFIG=""
+	fi
+fi
+
+pkg_failed=no
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for ICU" >&5
+$as_echo_n "checking for ICU... " >&6; }
+
+if test -n "$ICU_CFLAGS"; then
+    pkg_cv_ICU_CFLAGS="$ICU_CFLAGS"
+ elif test -n "$PKG_CONFIG"; then
+    if test -n "$PKG_CONFIG" && \
+    { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"icu-uc icu-i18n\""; } >&5
+  ($PKG_CONFIG --exists --print-errors "icu-uc icu-i18n") 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; then
+  pkg_cv_ICU_CFLAGS=`$PKG_CONFIG --cflags "icu-uc icu-i18n" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes
+else
+  pkg_failed=yes
+fi
+ else
+    pkg_failed=untried
+fi
+if test -n "$ICU_LIBS"; then
+    pkg_cv_ICU_LIBS="$ICU_LIBS"
+ elif test -n "$PKG_CONFIG"; then
+    if test -n "$PKG_CONFIG" && \
+    { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"icu-uc icu-i18n\""; } >&5
+  ($PKG_CONFIG --exists --print-errors "icu-uc icu-i18n") 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; then
+  pkg_cv_ICU_LIBS=`$PKG_CONFIG --libs "icu-uc icu-i18n" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes
+else
+  pkg_failed=yes
+fi
+ else
+    pkg_failed=untried
+fi
+
+
+
+if test $pkg_failed = yes; then
+   	{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+
+if $PKG_CONFIG --atleast-pkgconfig-version 0.20; then
+        _pkg_short_errors_supported=yes
+else
+        _pkg_short_errors_supported=no
+fi
+        if test $_pkg_short_errors_supported = yes; then
+	        ICU_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "icu-uc icu-i18n" 2>&1`
+        else
+	        ICU_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "icu-uc icu-i18n" 2>&1`
+        fi
+	# Put the nasty error message in config.log where it belongs
+	echo "$ICU_PKG_ERRORS" >&5
+
+	as_fn_error $? "Package requirements (icu-uc icu-i18n) were not met:
+
+$ICU_PKG_ERRORS
+
+Consider adjusting the PKG_CONFIG_PATH environment variable if you
+installed software in a non-standard prefix.
+
+Alternatively, you may set the environment variables ICU_CFLAGS
+and ICU_LIBS to avoid the need to call pkg-config.
+See the pkg-config man page for more details." "$LINENO" 5
+elif test $pkg_failed = untried; then
+     	{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+	{ { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5
+$as_echo "$as_me: error: in \`$ac_pwd':" >&2;}
+as_fn_error $? "The pkg-config script could not be found or is too old.  Make sure it
+is in your PATH or set the PKG_CONFIG environment variable to the full
+path to pkg-config.
+
+Alternatively, you may set the environment variables ICU_CFLAGS
+and ICU_LIBS to avoid the need to call pkg-config.
+See the pkg-config man page for more details.
+
+To get pkg-config, see <http://pkg-config.freedesktop.org/>.
+See \`config.log' for more details" "$LINENO" 5; }
+else
+	ICU_CFLAGS=$pkg_cv_ICU_CFLAGS
+	ICU_LIBS=$pkg_cv_ICU_LIBS
+        { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5
+$as_echo "yes" >&6; }
+
+fi
+fi
+
+#
 # Optionally build Tcl modules (PL/Tcl)
 #
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with Tcl" >&5
@@ -13405,6 +13674,50 @@ fi
 done
 
 
+if test "$with_icu" = yes; then
+  # ICU functions are macros, so we need to do this the long way.
+
+  # ucol_strcollUTF8() appeared in ICU 50.
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for ucol_strcollUTF8" >&5
+$as_echo_n "checking for ucol_strcollUTF8... " >&6; }
+if ${pgac_cv_func_ucol_strcollUTF8+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  ac_save_CPPFLAGS=$CPPFLAGS
+CPPFLAGS="$ICU_CFLAGS $CPPFLAGS"
+ac_save_LIBS=$LIBS
+LIBS="$ICU_LIBS $LIBS"
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+#include <unicode/ucol.h>
+
+int
+main ()
+{
+ucol_strcollUTF8(NULL, NULL, 0, NULL, 0, NULL);
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+  pgac_cv_func_ucol_strcollUTF8=yes
+else
+  pgac_cv_func_ucol_strcollUTF8=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+    conftest$ac_exeext conftest.$ac_ext
+CPPFLAGS=$ac_save_CPPFLAGS
+LIBS=$ac_save_LIBS
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_func_ucol_strcollUTF8" >&5
+$as_echo "$pgac_cv_func_ucol_strcollUTF8" >&6; }
+  if test "$pgac_cv_func_ucol_strcollUTF8" = yes ; then
+
+$as_echo "#define HAVE_UCOL_STRCOLLUTF8 1" >>confdefs.h
+
+  fi
+fi
+
 # Lastly, restore full LIBS list and check for readline/libedit symbols
 LIBS="$LIBS_including_readline"
 
diff --git a/configure.in b/configure.in
index b9831bc340..5ee9c370c0 100644
--- a/configure.in
+++ b/configure.in
@@ -614,6 +614,19 @@ AC_MSG_RESULT([$enable_thread_safety])
 AC_SUBST(enable_thread_safety)
 
 #
+# ICU
+#
+AC_MSG_CHECKING([whether to build with ICU support])
+PGAC_ARG_BOOL(with, icu, no, [build with ICU support],
+              [AC_DEFINE([USE_ICU], 1, [Define to build with ICU support. (--with-icu)])])
+AC_MSG_RESULT([$with_icu])
+AC_SUBST(with_icu)
+
+if test "$with_icu" = yes; then
+  PKG_CHECK_MODULES(ICU, icu-uc icu-i18n)
+fi
+
+#
 # Optionally build Tcl modules (PL/Tcl)
 #
 AC_MSG_CHECKING([whether to build with Tcl])
@@ -1639,6 +1652,28 @@ fi
 AC_CHECK_FUNCS([strtoll strtoq], [break])
 AC_CHECK_FUNCS([strtoull strtouq], [break])
 
+if test "$with_icu" = yes; then
+  # ICU functions are macros, so we need to do this the long way.
+
+  # ucol_strcollUTF8() appeared in ICU 50.
+  AC_CACHE_CHECK([for ucol_strcollUTF8], [pgac_cv_func_ucol_strcollUTF8],
+[ac_save_CPPFLAGS=$CPPFLAGS
+CPPFLAGS="$ICU_CFLAGS $CPPFLAGS"
+ac_save_LIBS=$LIBS
+LIBS="$ICU_LIBS $LIBS"
+AC_LINK_IFELSE([AC_LANG_PROGRAM(
+[#include <unicode/ucol.h>
+],
+[ucol_strcollUTF8(NULL, NULL, 0, NULL, 0, NULL);])],
+[pgac_cv_func_ucol_strcollUTF8=yes],
+[pgac_cv_func_ucol_strcollUTF8=no])
+CPPFLAGS=$ac_save_CPPFLAGS
+LIBS=$ac_save_LIBS])
+  if test "$pgac_cv_func_ucol_strcollUTF8" = yes ; then
+    AC_DEFINE([HAVE_UCOL_STRCOLLUTF8], 1, [Define to 1 if you have the `ucol_strcollUTF8' function.])
+  fi
+fi
+
 # Lastly, restore full LIBS list and check for readline/libedit symbols
 LIBS="$LIBS_including_readline"
 
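
For context, the HAVE_UCOL_STRCOLLUTF8 symbol probed above is intended for the
string comparison path.  A minimal sketch of how it might be consumed, assuming
a bare UCollator handle; the helper name icu_compare_utf8 and its surroundings
are illustrative only and not part of this patch:

#ifdef USE_ICU
#include <unicode/ucol.h>

static int
icu_compare_utf8(UCollator *ucol, const char *arg1, int32_t len1,
				 const char *arg2, int32_t len2)
{
	UErrorCode	status = U_ZERO_ERROR;
	UCollationResult r = UCOL_EQUAL;

#ifdef HAVE_UCOL_STRCOLLUTF8
	/* ICU >= 50 can compare UTF-8 input directly */
	r = ucol_strcollUTF8(ucol, arg1, len1, arg2, len2, &status);
#else
	/* older ICU: convert to UChar first and use ucol_strcoll() instead */
#endif
	if (U_FAILURE(status))
		return 0;
	return (r == UCOL_LESS) ? -1 : (r == UCOL_GREATER) ? 1 : 0;
}
#endif   /* USE_ICU */
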
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 493050618d..a5a5b19695 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1989,6 +1989,14 @@ <title><structname>pg_collation</> Columns</title>
      </row>
 
      <row>
+      <entry><structfield>collprovider</structfield></entry>
+      <entry><type>char</type></entry>
+      <entry></entry>
+      <entry>Provider of the collation: <literal>d</literal> = database
+       default, <literal>c</literal> = libc, <literal>i</literal> = icu</entry>
+     </row>
+
+     <row>
       <entry><structfield>collencoding</structfield></entry>
       <entry><type>int4</type></entry>
       <entry></entry>
@@ -2009,6 +2017,17 @@ <title><structname>pg_collation</> Columns</title>
       <entry></entry>
       <entry><symbol>LC_CTYPE</> for this collation object</entry>
      </row>
+
+     <row>
+      <entry><structfield>collversion</structfield></entry>
+      <entry><type>int4</type></entry>
+      <entry></entry>
+      <entry>
+       Provider-specific version of the collation.  This is recorded when the
+       collation is created and then checked when it is used, to detect
+       changes in the collation definition that could lead to data corruption.
+      </entry>
+     </row>
     </tbody>
    </tgroup>
   </table>
diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml
index f8c7ac3b16..6e3fee150a 100644
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@@ -500,16 +500,28 @@ <title>Concepts</title>
    <title>Managing Collations</title>
 
    <para>
-    A collation is an SQL schema object that maps an SQL name to
-    operating system locales.  In particular, it maps to a combination
-    of <symbol>LC_COLLATE</symbol> and <symbol>LC_CTYPE</symbol>.  (As
+    A collation is an SQL schema object that maps an SQL name to locales
+    provided by libraries installed in the operating system.  A collation
+    definition has a <firstterm>provider</firstterm> that specifies which
+    library supplies the locale data.  One standard provider name
+    is <literal>libc</literal>, which uses the locales provided by the
+    operating system C library.  These are the locales that most tools
+    provided by the operating system use.  Another provider
+    is <literal>icu</literal>, which uses the external
+    ICU<indexterm><primary>ICU</></> library.  Support for ICU has to be
+    configured when PostgreSQL is built.
+   </para>
+
+   <para>
+    A collation object provided by <literal>libc</literal> maps to a combination
+    of <symbol>LC_COLLATE</symbol> and <symbol>LC_CTYPE</symbol> settings.  (As
     the name would suggest, the main purpose of a collation is to set
     <symbol>LC_COLLATE</symbol>, which controls the sort order.  But
     it is rarely necessary in practice to have an
     <symbol>LC_CTYPE</symbol> setting that is different from
     <symbol>LC_COLLATE</symbol>, so it is more convenient to collect
     these under one concept than to create another infrastructure for
-    setting <symbol>LC_CTYPE</symbol> per expression.)  Also, a collation
+    setting <symbol>LC_CTYPE</symbol> per expression.)  Also, a <literal>libc</literal> collation
     is tied to a character set encoding (see <xref linkend="multibyte">).
     The same collation name may exist for different encodings.
    </para>
@@ -528,8 +540,18 @@ <title>Managing Collations</title>
    </para>
 
    <para>
+    A collation provided by <literal>icu</literal> maps to a named collator
+    provided by the ICU library.  ICU does not support
+    separate <quote>collate</quote> and <quote>ctype</quote> settings, so they
+    are always the same.  Also, ICU collations are independent of the
+    encoding, so there is always only one ICU collation for a given name in a
+    database.
+   </para>
+
+   <para>
     If the operating system provides support for using multiple locales
     within a single program (<function>newlocale</> and related functions),
+    or support for ICU is configured,
     then when a database cluster is initialized, <command>initdb</command>
     populates the system catalog <literal>pg_collation</literal> with
     collations based on all the locales it finds on the operating
@@ -544,17 +566,20 @@ <title>Managing Collations</title>
     under the name <literal>de_DE</literal>, which is less cumbersome
     to write and makes the name less encoding-dependent.  Note that,
     nevertheless, the initial set of collation names is
-    platform-dependent.
+    platform-dependent.  Collations provided by ICU are created
+    with <literal>%icu</literal> appended, for
+    example <literal>de_DE%icu</literal>.
    </para>
 
    <para>
-    In case a collation is needed that has different values for
+    In case a <literal>libc</literal> collation is needed that has different values for
     <symbol>LC_COLLATE</symbol> and <symbol>LC_CTYPE</symbol>, a new
     collation may be created using
     the <xref linkend="sql-createcollation"> command.  That command
     can also be used to create a new collation from an existing
     collation, which can be useful to be able to use
-    operating-system-independent collation names in applications.
+    operating-system-independent collation names in applications or to use an
+    ICU-provided collation under a suffix-less name.
    </para>
 
    <para>
@@ -566,8 +591,8 @@ <title>Managing Collations</title>
     Use of the stripped collation names is recommended, since it will
     make one less thing you need to change if you decide to change to
     another database encoding.  Note however that the <literal>default</>,
-    <literal>C</>, and <literal>POSIX</> collations can be used
-    regardless of the database encoding.
+    <literal>C</>, and <literal>POSIX</> collations, as well as all collations
+    provided by ICU, can be used regardless of the database encoding.
    </para>
 
    <para>
diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
index 4431ed75a9..5c96cba51a 100644
--- a/doc/src/sgml/installation.sgml
+++ b/doc/src/sgml/installation.sgml
@@ -771,6 +771,20 @@ <title>Configuration</title>
       </varlistentry>
 
       <varlistentry>
+       <term><option>--with-icu</option></term>
+       <listitem>
+        <para>
+         Build with support for
+         the <productname>ICU</productname><indexterm><primary>ICU</></>
+         library.  This requires the <productname>ICU4C</productname> package
+         as well
+         as <productname>pkg-config</productname><indexterm><primary>pkg-config</></>
+         to be installed.
+        </para>
+       </listitem>
+      </varlistentry>
+
+      <varlistentry>
        <term><option>--with-openssl</option>
        <indexterm>
         <primary>OpenSSL</primary>
diff --git a/doc/src/sgml/ref/alter_collation.sgml b/doc/src/sgml/ref/alter_collation.sgml
index 6708c7e10e..c218539c1b 100644
--- a/doc/src/sgml/ref/alter_collation.sgml
+++ b/doc/src/sgml/ref/alter_collation.sgml
@@ -21,6 +21,8 @@
 
  <refsynopsisdiv>
 <synopsis>
+ALTER COLLATION <replaceable>name</replaceable> REFRESH VERSION
+
 ALTER COLLATION <replaceable>name</replaceable> RENAME TO <replaceable>new_name</replaceable>
 ALTER COLLATION <replaceable>name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_USER | SESSION_USER }
 ALTER COLLATION <replaceable>name</replaceable> SET SCHEMA <replaceable>new_schema</replaceable>
@@ -85,9 +87,49 @@ <title>Parameters</title>
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term><literal>REFRESH VERSION</literal></term>
+    <listitem>
+     <para>
+      Update the collation version.
+      See <xref linkend="sql-altercollation-notes"> below.
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
  </refsect1>
 
+ <refsect1 id="sql-altercollation-notes">
+  <title>Notes</title>
+
+  <para>
+   When using collations provided by the ICU library, the ICU-specific version
+   of the collator is recorded in the system catalog when the collation object
+   is created.  When the collation is then used, the current version is
+   checked against the recorded version, and a warning is issued when there is
+   a mismatch, for example:
+<screen>
+WARNING:  ICU collator version mismatch
+DETAIL:  The database was created using version 0x00000005, the library provides version 0x00000006.
+HINT:  Rebuild all objects affected by this collation and run ALTER COLLATION pg_catalog."xx%icu" REFRESH VERSION, or build PostgreSQL with the right version of ICU.
+</screen>
+   A change in collation definitions can lead to corrupt indexes and other
+   problems where the database system relies on stored objects having a
+   certain sort order.  Generally, this should be avoided, but it can happen
+   in legitimate circumstances, such as when
+   using <command>pg_upgrade</command> to upgrade to server binaries linked
+   with a newer version of ICU.  When this happens, all objects depending on
+   the collation should be rebuilt, for example,
+   using <command>REINDEX</command>.  When that is done, the collation version
+   can be refreshed using the command <literal>ALTER COLLATION ... REFRESH
+   VERSION</literal>.  This will update the system catalog to record the
+   current collator version and will make the warning go away.  Note that this
+   does not actually check whether all affected objects have been rebuilt
+   correctly.
+  </para>
+ </refsect1>
+
  <refsect1>
   <title>Examples</title>
 
diff --git a/doc/src/sgml/ref/create_collation.sgml b/doc/src/sgml/ref/create_collation.sgml
index d757cdfb43..b958f26fab 100644
--- a/doc/src/sgml/ref/create_collation.sgml
+++ b/doc/src/sgml/ref/create_collation.sgml
@@ -21,7 +21,8 @@
 CREATE COLLATION <replaceable>name</replaceable> (
     [ LOCALE = <replaceable>locale</replaceable>, ]
     [ LC_COLLATE = <replaceable>lc_collate</replaceable>, ]
-    [ LC_CTYPE = <replaceable>lc_ctype</replaceable> ]
+    [ LC_CTYPE = <replaceable>lc_ctype</replaceable>, ]
+    [ PROVIDER = <replaceable>provider</replaceable> ]
 )
 CREATE COLLATION <replaceable>name</replaceable> FROM <replaceable>existing_collation</replaceable>
 </synopsis>
@@ -103,6 +104,19 @@ <title>Parameters</title>
     </varlistentry>
 
     <varlistentry>
+     <term><replaceable>provider</replaceable></term>
+
+     <listitem>
+      <para>
+       Specifies the provider to use for locale services associated with this
+       collation.  Possible values
+       are: <literal>icu</literal>,<indexterm><primary>ICU</></> <literal>libc</literal>.
+       The available choices depend on the operating system and build options.
+      </para>
+     </listitem>
+    </varlistentry>
+
+    <varlistentry>
      <term><replaceable>existing_collation</replaceable></term>
 
      <listitem>
diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index d39d6ca867..933c3ff05c 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -179,6 +179,7 @@ pgxsdir = $(pkglibdir)/pgxs
 #
 # Records the choice of the various --enable-xxx and --with-xxx options.
 
+with_icu	= @with_icu@
 with_perl	= @with_perl@
 with_python	= @with_python@
 with_tcl	= @with_tcl@
@@ -208,6 +209,9 @@ python_version		= @python_version@
 
 krb_srvtab = @krb_srvtab@
 
+ICU_CFLAGS		= @ICU_CFLAGS@
+ICU_LIBS		= @ICU_LIBS@
+
 TCLSH			= @TCLSH@
 TCL_LIBS		= @TCL_LIBS@
 TCL_LIB_SPEC		= @TCL_LIB_SPEC@
diff --git a/src/backend/Makefile b/src/backend/Makefile
index a1d3f067d7..df652b52bf 100644
--- a/src/backend/Makefile
+++ b/src/backend/Makefile
@@ -58,7 +58,7 @@ ifneq ($(PORTNAME), win32)
 ifneq ($(PORTNAME), aix)
 
 postgres: $(OBJS)
-	$(CC) $(CFLAGS) $(LDFLAGS) $(LDFLAGS_EX) $(export_dynamic) $(call expand_subsys,$^) $(LIBS) -o $@
+	$(CC) $(CFLAGS) $(LDFLAGS) $(LDFLAGS_EX) $(export_dynamic) $(call expand_subsys,$^) $(LIBS) $(ICU_LIBS) -o $@
 
 endif
 endif
diff --git a/src/backend/catalog/pg_collation.c b/src/backend/catalog/pg_collation.c
index f37cf37c4a..b337729f48 100644
--- a/src/backend/catalog/pg_collation.c
+++ b/src/backend/catalog/pg_collation.c
@@ -40,8 +40,10 @@
 Oid
 CollationCreate(const char *collname, Oid collnamespace,
 				Oid collowner,
+				char collprovider,
 				int32 collencoding,
-				const char *collcollate, const char *collctype)
+				const char *collcollate, const char *collctype,
+				int32 collversion)
 {
 	Relation	rel;
 	TupleDesc	tupDesc;
@@ -74,13 +76,16 @@ CollationCreate(const char *collname, Oid collnamespace,
 							  ObjectIdGetDatum(collnamespace)))
 		ereport(ERROR,
 				(errcode(ERRCODE_DUPLICATE_OBJECT),
-				 errmsg("collation \"%s\" for encoding \"%s\" already exists",
-						collname, pg_encoding_to_char(collencoding))));
+				 collencoding == -1
+				 ? errmsg("collation \"%s\" already exists",
+						  collname)
+				 : errmsg("collation \"%s\" for encoding \"%s\" already exists",
+						  collname, pg_encoding_to_char(collencoding))));
 
 	/*
 	 * Also forbid matching an any-encoding entry.  This test of course is not
 	 * backed up by the unique index, but it's not a problem since we don't
-	 * support adding any-encoding entries after initdb.
+	 * support adding any-encoding entries after initdb.  FIXME: no longer true for ICU collations (collencoding -1).
 	 */
 	if (SearchSysCacheExists3(COLLNAMEENCNSP,
 							  PointerGetDatum(collname),
@@ -102,11 +107,13 @@ CollationCreate(const char *collname, Oid collnamespace,
 	values[Anum_pg_collation_collname - 1] = NameGetDatum(&name_name);
 	values[Anum_pg_collation_collnamespace - 1] = ObjectIdGetDatum(collnamespace);
 	values[Anum_pg_collation_collowner - 1] = ObjectIdGetDatum(collowner);
+	values[Anum_pg_collation_collprovider - 1] = CharGetDatum(collprovider);
 	values[Anum_pg_collation_collencoding - 1] = Int32GetDatum(collencoding);
 	namestrcpy(&name_collate, collcollate);
 	values[Anum_pg_collation_collcollate - 1] = NameGetDatum(&name_collate);
 	namestrcpy(&name_ctype, collctype);
 	values[Anum_pg_collation_collctype - 1] = NameGetDatum(&name_ctype);
+	values[Anum_pg_collation_collversion - 1] = Int32GetDatum(collversion);
 
 	tup = heap_form_tuple(tupDesc, values, nulls);
 
diff --git a/src/backend/commands/collationcmds.c b/src/backend/commands/collationcmds.c
index 9bba748708..88c7751ca7 100644
--- a/src/backend/commands/collationcmds.c
+++ b/src/backend/commands/collationcmds.c
@@ -14,11 +14,13 @@
  */
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/htup_details.h"
 #include "access/xact.h"
 #include "catalog/dependency.h"
 #include "catalog/indexing.h"
 #include "catalog/namespace.h"
+#include "catalog/objectaccess.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_collation_fn.h"
 #include "commands/alter.h"
@@ -33,6 +35,9 @@
 #include "utils/rel.h"
 #include "utils/syscache.h"
 
+static int32 get_collation_version(char collprovider, const char *collcollate);
+
+
 /*
  * CREATE COLLATION
  */
@@ -47,8 +52,13 @@ DefineCollation(ParseState *pstate, List *names, List *parameters)
 	DefElem    *localeEl = NULL;
 	DefElem    *lccollateEl = NULL;
 	DefElem    *lcctypeEl = NULL;
+	DefElem    *providerEl = NULL;
 	char	   *collcollate = NULL;
 	char	   *collctype = NULL;
+	char	   *collproviderstr = NULL;
+	int			collencoding;
+	char		collprovider;
+	int32		collversion;
 	Oid			newoid;
 	ObjectAddress address;
 
@@ -72,6 +82,8 @@ DefineCollation(ParseState *pstate, List *names, List *parameters)
 			defelp = &lccollateEl;
 		else if (pg_strcasecmp(defel->defname, "lc_ctype") == 0)
 			defelp = &lcctypeEl;
+		else if (pg_strcasecmp(defel->defname, "provider") == 0)
+			defelp = &providerEl;
 		else
 		{
 			ereport(ERROR,
@@ -103,6 +115,7 @@ DefineCollation(ParseState *pstate, List *names, List *parameters)
 
 		collcollate = pstrdup(NameStr(((Form_pg_collation) GETSTRUCT(tp))->collcollate));
 		collctype = pstrdup(NameStr(((Form_pg_collation) GETSTRUCT(tp))->collctype));
+		collprovider = ((Form_pg_collation) GETSTRUCT(tp))->collprovider;
 
 		ReleaseSysCache(tp);
 	}
@@ -119,6 +132,26 @@ DefineCollation(ParseState *pstate, List *names, List *parameters)
 	if (lcctypeEl)
 		collctype = defGetString(lcctypeEl);
 
+	if (providerEl)
+		collproviderstr = defGetString(providerEl);
+
+	if (collproviderstr)
+	{
+		if (pg_strcasecmp(collproviderstr, "libc") == 0)
+			collprovider = COLLPROVIDER_LIBC;
+#ifdef USE_ICU
+		else if (pg_strcasecmp(collproviderstr, "icu") == 0)
+			collprovider = COLLPROVIDER_ICU;
+#endif
+		else
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+					 errmsg("unrecognized collation provider: %s",
+							collproviderstr)));
+	}
+	else if (!fromEl)
+		collprovider = COLLPROVIDER_LIBC;
+
 	if (!collcollate)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
@@ -129,14 +162,24 @@ DefineCollation(ParseState *pstate, List *names, List *parameters)
 				(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
 				 errmsg("parameter \"lc_ctype\" must be specified")));
 
-	check_encoding_locale_matches(GetDatabaseEncoding(), collcollate, collctype);
+	if (collprovider == COLLPROVIDER_ICU)
+		collencoding = -1;
+	else
+	{
+		collencoding = GetDatabaseEncoding();
+		check_encoding_locale_matches(collencoding, collcollate, collctype);
+	}
+
+	collversion = get_collation_version(collprovider, collcollate);
 
 	newoid = CollationCreate(collName,
 							 collNamespace,
 							 GetUserId(),
-							 GetDatabaseEncoding(),
+							 collprovider,
+							 collencoding,
 							 collcollate,
-							 collctype);
+							 collctype,
+							 collversion);
 
 	ObjectAddressSet(address, CollationRelationId, newoid);
 
@@ -147,6 +190,37 @@ DefineCollation(ParseState *pstate, List *names, List *parameters)
 	return address;
 }
 
+static int32
+get_collation_version(char collprovider, const char *collcollate)
+{
+	int32	collversion;
+
+#ifdef USE_ICU
+	if (collprovider == COLLPROVIDER_ICU)
+	{
+		UCollator  *collator;
+		UErrorCode	status;
+		UVersionInfo versioninfo;
+
+		status = U_ZERO_ERROR;
+		collator = ucol_open(collcollate, &status);
+		if (U_FAILURE(status))
+			ereport(ERROR,
+					(errmsg("could not open collator for locale \"%s\": %s",
+							collcollate, u_errorName(status))));
+
+		ucol_getVersion(collator, versioninfo);
+		ucol_close(collator);
+
+		collversion = ntohl(*((uint32 *) versioninfo));
+	}
+	else
+#endif
+		collversion = 0;
+
+	return collversion;
+}
+
 /*
  * Subroutine for ALTER COLLATION SET SCHEMA and RENAME
  *
@@ -177,3 +251,55 @@ IsThereCollationInNamespace(const char *collname, Oid nspOid)
 				 errmsg("collation \"%s\" already exists in schema \"%s\"",
 						collname, get_namespace_name(nspOid))));
 }
+
+/*
+ * ALTER COLLATION
+ */
+ObjectAddress
+AlterCollation(AlterCollationStmt *stmt)
+{
+	Relation	rel;
+	Oid			collOid;
+	HeapTuple	tup;
+	Form_pg_collation collForm;
+	int32		newversion;
+	ObjectAddress address;
+
+	rel = heap_open(CollationRelationId, RowExclusiveLock);
+	collOid = get_collation_oid(stmt->collname, false);
+
+	if (!pg_collation_ownercheck(collOid, GetUserId()))
+		aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_COLLATION,
+					   NameListToString(stmt->collname));
+
+	tup = SearchSysCacheCopy1(COLLOID, ObjectIdGetDatum(collOid));
+	if (!HeapTupleIsValid(tup))
+		elog(ERROR, "cache lookup failed for collation %u", collOid);
+
+	collForm = (Form_pg_collation) GETSTRUCT(tup);
+
+	newversion = get_collation_version(collForm->collprovider, NameStr(collForm->collcollate));
+	if (newversion != collForm->collversion)
+	{
+		/* This version formatting is specific to ICU. */
+		ereport(NOTICE,
+				(errmsg("changing version from 0x%08X to 0x%08X",
+						(uint32) collForm->collversion, (uint32) newversion)));
+		collForm->collversion = newversion;
+	}
+	else
+		ereport(NOTICE,
+				(errmsg("version has not changed")));
+
+	simple_heap_update(rel, &tup->t_self, tup);
+	CatalogUpdateIndexes(rel, tup);
+
+	InvokeObjectPostAlterHook(CollationRelationId, collOid, 0);
+
+	ObjectAddressSet(address, CollationRelationId, collOid);
+
+	heap_freetuple(tup);
+	heap_close(rel, NoLock);
+
+	return address;
+}
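
The warning described in the ALTER COLLATION documentation above would be
emitted when a collation is looked up for use, presumably next to where the
collator is opened in pg_locale.c (outside this excerpt).  Roughly, sketched
here with an assumed helper name; the message text mirrors the documentation:

static void
check_collation_version(const char *collname, Form_pg_collation collform)
{
	int32		current = get_collation_version(collform->collprovider,
												NameStr(collform->collcollate));

	if (current != collform->collversion)
		ereport(WARNING,
				(errmsg("ICU collator version mismatch"),
				 errdetail("The database was created using version 0x%08X, "
						   "the library provides version 0x%08X.",
						   (uint32) collform->collversion,
						   (uint32) current),
				 errhint("Rebuild all objects affected by this collation and run "
						 "ALTER COLLATION %s REFRESH VERSION, or build PostgreSQL "
						 "with the right version of ICU.", collname)));
}
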
diff --git a/src/backend/common.mk b/src/backend/common.mk
index 5d599dbd0c..0b57543bc4 100644
--- a/src/backend/common.mk
+++ b/src/backend/common.mk
@@ -8,6 +8,8 @@
 # this directory and SUBDIRS to subdirectories containing more things
 # to build.
 
+override CPPFLAGS := $(CPPFLAGS) $(ICU_CFLAGS)
+
 ifdef PARTIAL_LINKING
 # old style: linking using SUBSYS.o
 subsysfilename = SUBSYS.o
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 6955298577..7afd1ee5ea 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -2887,6 +2887,16 @@ _copyAlterTableCmd(const AlterTableCmd *from)
 	return newnode;
 }
 
+static AlterCollationStmt *
+_copyAlterCollationStmt(const AlterCollationStmt *from)
+{
+	AlterCollationStmt *newnode = makeNode(AlterCollationStmt);
+
+	COPY_NODE_FIELD(collname);
+
+	return newnode;
+}
+
 static AlterDomainStmt *
 _copyAlterDomainStmt(const AlterDomainStmt *from)
 {
@@ -4749,6 +4759,9 @@ copyObject(const void *from)
 		case T_AlterTableCmd:
 			retval = _copyAlterTableCmd(from);
 			break;
+		case T_AlterCollationStmt:
+			retval = _copyAlterCollationStmt(from);
+			break;
 		case T_AlterDomainStmt:
 			retval = _copyAlterDomainStmt(from);
 			break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 548a2aa876..2a71b584c3 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1053,6 +1053,14 @@ _equalAlterTableCmd(const AlterTableCmd *a, const AlterTableCmd *b)
 }
 
 static bool
+_equalAlterCollationStmt(const AlterCollationStmt *a, const AlterCollationStmt *b)
+{
+	COMPARE_NODE_FIELD(collname);
+
+	return true;
+}
+
+static bool
 _equalAlterDomainStmt(const AlterDomainStmt *a, const AlterDomainStmt *b)
 {
 	COMPARE_SCALAR_FIELD(subtype);
@@ -3036,6 +3044,9 @@ equal(const void *a, const void *b)
 		case T_AlterTableCmd:
 			retval = _equalAlterTableCmd(a, b);
 			break;
+		case T_AlterCollationStmt:
+			retval = _equalAlterCollationStmt(a, b);
+			break;
 		case T_AlterDomainStmt:
 			retval = _equalAlterDomainStmt(a, b);
 			break;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 834a00971a..888ac8376a 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -234,7 +234,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 }
 
 %type <node>	stmt schema_stmt
-		AlterEventTrigStmt
+		AlterEventTrigStmt AlterCollationStmt
 		AlterDatabaseStmt AlterDatabaseSetStmt AlterDomainStmt AlterEnumStmt
 		AlterFdwStmt AlterForeignServerStmt AlterGroupStmt
 		AlterObjectDependsStmt AlterObjectSchemaStmt AlterOwnerStmt
@@ -776,6 +776,7 @@ stmtmulti:	stmtmulti ';' stmt
 
 stmt :
 			AlterEventTrigStmt
+			| AlterCollationStmt
 			| AlterDatabaseStmt
 			| AlterDatabaseSetStmt
 			| AlterDefaultPrivilegesStmt
@@ -9443,6 +9444,21 @@ DropdbStmt: DROP DATABASE database_name
 
 /*****************************************************************************
  *
+ *		ALTER COLLATION
+ *
+ *****************************************************************************/
+
+AlterCollationStmt: ALTER COLLATION any_name REFRESH VERSION_P
+				{
+					AlterCollationStmt *n = makeNode(AlterCollationStmt);
+					n->collname = $3;
+					$$ = (Node *)n;
+				}
+		;
+
+
+/*****************************************************************************
+ *
  *		ALTER SYSTEM
  *
  * This is used to change configuration parameters persistently.
diff --git a/src/backend/regex/regc_pg_locale.c b/src/backend/regex/regc_pg_locale.c
index ad9d4b1961..ee43d5ab58 100644
--- a/src/backend/regex/regc_pg_locale.c
+++ b/src/backend/regex/regc_pg_locale.c
@@ -68,7 +68,8 @@ typedef enum
 	PG_REGEX_LOCALE_WIDE,		/* Use <wctype.h> functions */
 	PG_REGEX_LOCALE_1BYTE,		/* Use <ctype.h> functions */
 	PG_REGEX_LOCALE_WIDE_L,		/* Use locale_t <wctype.h> functions */
-	PG_REGEX_LOCALE_1BYTE_L		/* Use locale_t <ctype.h> functions */
+	PG_REGEX_LOCALE_1BYTE_L,	/* Use locale_t <ctype.h> functions */
+	PG_REGEX_LOCALE_ICU			/* Use ICU uchar.h functions */
 } PG_Locale_Strategy;
 
 static PG_Locale_Strategy pg_regex_strategy;
@@ -262,6 +263,11 @@ pg_set_regex_collation(Oid collation)
 					 errhint("Use the COLLATE clause to set the collation explicitly.")));
 		}
 
+#ifdef USE_ICU
+		if (pg_regex_locale && pg_regex_locale->provider == COLLPROVIDER_ICU)
+			pg_regex_strategy = PG_REGEX_LOCALE_ICU;
+		else
+#endif
 #ifdef USE_WIDE_UPPER_LOWER
 		if (GetDatabaseEncoding() == PG_UTF8)
 		{
@@ -303,13 +309,18 @@ pg_wc_isdigit(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswdigit_l((wint_t) c, pg_regex_locale);
+				return iswdigit_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isdigit_l((unsigned char) c, pg_regex_locale));
+					isdigit_l((unsigned char) c, pg_regex_locale->lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isdigit(c);
 #endif
 			break;
 	}
@@ -336,13 +347,18 @@ pg_wc_isalpha(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswalpha_l((wint_t) c, pg_regex_locale);
+				return iswalpha_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isalpha_l((unsigned char) c, pg_regex_locale));
+					isalpha_l((unsigned char) c, pg_regex_locale->lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isalpha(c);
 #endif
 			break;
 	}
@@ -369,13 +385,18 @@ pg_wc_isalnum(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswalnum_l((wint_t) c, pg_regex_locale);
+				return iswalnum_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isalnum_l((unsigned char) c, pg_regex_locale));
+					isalnum_l((unsigned char) c, pg_regex_locale->lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isalnum(c);
 #endif
 			break;
 	}
@@ -402,13 +423,18 @@ pg_wc_isupper(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswupper_l((wint_t) c, pg_regex_locale);
+				return iswupper_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isupper_l((unsigned char) c, pg_regex_locale));
+					isupper_l((unsigned char) c, pg_regex_locale->lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isupper(c);
 #endif
 			break;
 	}
@@ -435,13 +461,18 @@ pg_wc_islower(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswlower_l((wint_t) c, pg_regex_locale);
+				return iswlower_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					islower_l((unsigned char) c, pg_regex_locale));
+					islower_l((unsigned char) c, pg_regex_locale->lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_islower(c);
 #endif
 			break;
 	}
@@ -468,13 +499,18 @@ pg_wc_isgraph(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswgraph_l((wint_t) c, pg_regex_locale);
+				return iswgraph_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isgraph_l((unsigned char) c, pg_regex_locale));
+					isgraph_l((unsigned char) c, pg_regex_locale->lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isgraph(c);
 #endif
 			break;
 	}
@@ -501,13 +537,18 @@ pg_wc_isprint(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswprint_l((wint_t) c, pg_regex_locale);
+				return iswprint_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isprint_l((unsigned char) c, pg_regex_locale));
+					isprint_l((unsigned char) c, pg_regex_locale->lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isprint(c);
 #endif
 			break;
 	}
@@ -534,13 +575,18 @@ pg_wc_ispunct(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswpunct_l((wint_t) c, pg_regex_locale);
+				return iswpunct_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					ispunct_l((unsigned char) c, pg_regex_locale));
+					ispunct_l((unsigned char) c, pg_regex_locale->lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_ispunct(c);
 #endif
 			break;
 	}
@@ -567,13 +613,18 @@ pg_wc_isspace(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswspace_l((wint_t) c, pg_regex_locale);
+				return iswspace_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isspace_l((unsigned char) c, pg_regex_locale));
+					isspace_l((unsigned char) c, pg_regex_locale->lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isspace(c);
 #endif
 			break;
 	}
@@ -608,15 +659,20 @@ pg_wc_toupper(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return towupper_l((wint_t) c, pg_regex_locale);
+				return towupper_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			if (c <= (pg_wchar) UCHAR_MAX)
-				return toupper_l((unsigned char) c, pg_regex_locale);
+				return toupper_l((unsigned char) c, pg_regex_locale->lt);
 #endif
 			return c;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_toupper(c);
+#endif
+			break;
 	}
 	return 0;					/* can't get here, but keep compiler quiet */
 }
@@ -649,15 +705,20 @@ pg_wc_tolower(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return towlower_l((wint_t) c, pg_regex_locale);
+				return towlower_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			if (c <= (pg_wchar) UCHAR_MAX)
-				return tolower_l((unsigned char) c, pg_regex_locale);
+				return tolower_l((unsigned char) c, pg_regex_locale->lt);
 #endif
 			return c;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_tolower(c);
+#endif
+			break;
 	}
 	return 0;					/* can't get here, but keep compiler quiet */
 }
@@ -808,6 +869,9 @@ pg_ctype_get_cache(pg_wc_probefunc probefunc, int cclasscode)
 			max_chr = (pg_wchar) MAX_SIMPLE_CHR;
 #endif
 			break;
+		case PG_REGEX_LOCALE_ICU:
+			max_chr = (pg_wchar) MAX_SIMPLE_CHR;
+			break;
 		default:
 			max_chr = 0;		/* can't get here, but keep compiler quiet */
 			break;
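
The pg_regex_locale->provider and ->lt accesses above, together with
mylocale->locale in formatting.c below, assume the new struct-based
pg_locale_t.  Roughly like this; field names other than provider, lt, and
locale are guesses:

/* Sketch of the struct behind pg_locale_t; exact layout is an assumption. */
struct pg_locale_struct
{
	char		provider;		/* COLLPROVIDER_LIBC or COLLPROVIDER_ICU */
#ifdef HAVE_LOCALE_T
	locale_t	lt;				/* libc locale handle, when provider is libc */
#endif
#ifdef USE_ICU
	const char *locale;			/* ICU locale name, e.g. "fr_FR" */
	UCollator  *collator;		/* ICU collator handle (assumed field name) */
#endif
};

typedef struct pg_locale_struct *pg_locale_t;
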
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 2e89ad783a..429199b610 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -1562,6 +1562,10 @@ ProcessUtilitySlow(ParseState *pstate,
 				address = CreateAccessMethod((CreateAmStmt *) parsetree);
 				break;
 
+			case T_AlterCollationStmt:
+				address = AlterCollation((AlterCollationStmt *) parsetree);
+				break;
+
 			default:
 				elog(ERROR, "unrecognized node type: %d",
 					 (int) nodeTag(parsetree));
@@ -2575,6 +2579,10 @@ CreateCommandTag(Node *parsetree)
 			tag = "CREATE ACCESS METHOD";
 			break;
 
+		case T_AlterCollationStmt:
+			tag = "ALTER COLLATION";
+			break;
+
 		case T_PrepareStmt:
 			tag = "PREPARE";
 			break;
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index d4eaa506e8..f3b61707b9 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -82,6 +82,10 @@
 #include <wctype.h>
 #endif
 
+#ifdef USE_ICU
+#include <unicode/ustring.h>
+#endif
+
 #include "catalog/pg_collation.h"
 #include "mb/pg_wchar.h"
 #include "utils/builtins.h"
@@ -1443,6 +1447,42 @@ str_numth(char *dest, char *num, int type)
  *			upper/lower/initcap functions
  *****************************************************************************/
 
+#ifdef USE_ICU
+static int32_t
+icu_convert_case(int32_t (*func)(UChar *, int32_t, const UChar *, int32_t, const char *, UErrorCode *),
+				 pg_locale_t mylocale, UChar **buff_dest, UChar *buff_source, int32_t len_source)
+{
+	UErrorCode	status;
+	int32_t		len_dest;
+
+	len_dest = len_source;  /* try first with same length */
+	*buff_dest = palloc(len_dest * sizeof(**buff_dest));
+	status = U_ZERO_ERROR;
+	len_dest = func(*buff_dest, len_dest, buff_source, len_source, mylocale->locale, &status);
+	if (status == U_BUFFER_OVERFLOW_ERROR)
+	{
+		/* try again with adjusted length */
+		pfree(*buff_dest);
+		*buff_dest = palloc(len_dest * sizeof(**buff_dest));
+		status = U_ZERO_ERROR;
+		len_dest = func(*buff_dest, len_dest, buff_source, len_source, mylocale->locale, &status);
+	}
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("case conversion failed: %s", u_errorName(status))));
+	return len_dest;
+}
+
+static int32_t
+u_strToTitle_default_BI(UChar *dest, int32_t destCapacity,
+						const UChar *src, int32_t srcLength,
+						const char *locale,
+						UErrorCode *pErrorCode)
+{
+	return u_strToTitle(dest, destCapacity, src, srcLength, NULL, locale, pErrorCode);
+}
+#endif
+
 /*
  * If the system provides the needed functions for wide-character manipulation
  * (which are all standardized by C99), then we implement upper/lower/initcap
@@ -1479,12 +1519,9 @@ str_tolower(const char *buff, size_t nbytes, Oid collid)
 		result = asc_tolower(buff, nbytes);
 	}
 #ifdef USE_WIDE_UPPER_LOWER
-	else if (pg_database_encoding_max_length() > 1)
+	else
 	{
 		pg_locale_t mylocale = 0;
-		wchar_t    *workspace;
-		size_t		curr_char;
-		size_t		result_size;
 
 		if (collid != DEFAULT_COLLATION_OID)
 		{
@@ -1502,77 +1539,79 @@ str_tolower(const char *buff, size_t nbytes, Oid collid)
 			mylocale = pg_newlocale_from_collation(collid);
 		}
 
-		/* Overflow paranoia */
-		if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
-			ereport(ERROR,
-					(errcode(ERRCODE_OUT_OF_MEMORY),
-					 errmsg("out of memory")));
+#ifdef USE_ICU
+		if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
+		{
+			int32_t		len_uchar;
+			int32_t		len_conv;
+			UChar	   *buff_uchar;
+			UChar	   *buff_conv;
+
+			len_uchar = icu_to_uchar(mylocale, &buff_uchar, buff, nbytes);
+			len_conv = icu_convert_case(u_strToLower, mylocale, &buff_conv, buff_uchar, len_uchar);
+			icu_from_uchar(mylocale, &result, buff_conv, len_conv);
+		}
+		else
+#endif
+		{
+			if (pg_database_encoding_max_length() > 1)
+			{
+				wchar_t    *workspace;
+				size_t		curr_char;
+				size_t		result_size;
 
-		/* Output workspace cannot have more codes than input bytes */
-		workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
+				/* Overflow paranoia */
+				if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
+					ereport(ERROR,
+							(errcode(ERRCODE_OUT_OF_MEMORY),
+							 errmsg("out of memory")));
 
-		char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
+				/* Output workspace cannot have more codes than input bytes */
+				workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
 
-		for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
-		{
+				char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
+
+				for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
+				{
 #ifdef HAVE_LOCALE_T
-			if (mylocale)
-				workspace[curr_char] = towlower_l(workspace[curr_char], mylocale);
-			else
+					if (mylocale)
+						workspace[curr_char] = towlower_l(workspace[curr_char], mylocale->lt);
+					else
 #endif
-				workspace[curr_char] = towlower(workspace[curr_char]);
-		}
+						workspace[curr_char] = towlower(workspace[curr_char]);
+				}
 
-		/* Make result large enough; case change might change number of bytes */
-		result_size = curr_char * pg_database_encoding_max_length() + 1;
-		result = palloc(result_size);
+				/* Make result large enough; case change might change number of bytes */
+				result_size = curr_char * pg_database_encoding_max_length() + 1;
+				result = palloc(result_size);
 
-		wchar2char(result, workspace, result_size, mylocale);
-		pfree(workspace);
-	}
+				wchar2char(result, workspace, result_size, mylocale);
+				pfree(workspace);
+			}
 #endif   /* USE_WIDE_UPPER_LOWER */
-	else
-	{
-#ifdef HAVE_LOCALE_T
-		pg_locale_t mylocale = 0;
-#endif
-		char	   *p;
-
-		if (collid != DEFAULT_COLLATION_OID)
-		{
-			if (!OidIsValid(collid))
+			else
 			{
-				/*
-				 * This typically means that the parser could not resolve a
-				 * conflict of implicit collations, so report it that way.
-				 */
-				ereport(ERROR,
-						(errcode(ERRCODE_INDETERMINATE_COLLATION),
-						 errmsg("could not determine which collation to use for lower() function"),
-						 errhint("Use the COLLATE clause to set the collation explicitly.")));
-			}
-#ifdef HAVE_LOCALE_T
-			mylocale = pg_newlocale_from_collation(collid);
-#endif
-		}
+				char	   *p;
 
-		result = pnstrdup(buff, nbytes);
+				result = pnstrdup(buff, nbytes);
 
-		/*
-		 * Note: we assume that tolower_l() will not be so broken as to need
-		 * an isupper_l() guard test.  When using the default collation, we
-		 * apply the traditional Postgres behavior that forces ASCII-style
-		 * treatment of I/i, but in non-default collations you get exactly
-		 * what the collation says.
-		 */
-		for (p = result; *p; p++)
-		{
+				/*
+				 * Note: we assume that tolower_l() will not be so broken as to need
+				 * an isupper_l() guard test.  When using the default collation, we
+				 * apply the traditional Postgres behavior that forces ASCII-style
+				 * treatment of I/i, but in non-default collations you get exactly
+				 * what the collation says.
+				 */
+				for (p = result; *p; p++)
+				{
 #ifdef HAVE_LOCALE_T
-			if (mylocale)
-				*p = tolower_l((unsigned char) *p, mylocale);
-			else
+					if (mylocale)
+						*p = tolower_l((unsigned char) *p, mylocale->lt);
+					else
 #endif
-				*p = pg_tolower((unsigned char) *p);
+						*p = pg_tolower((unsigned char) *p);
+				}
+			}
 		}
 	}
 
@@ -1599,12 +1638,9 @@ str_toupper(const char *buff, size_t nbytes, Oid collid)
 		result = asc_toupper(buff, nbytes);
 	}
 #ifdef USE_WIDE_UPPER_LOWER
-	else if (pg_database_encoding_max_length() > 1)
+	else
 	{
 		pg_locale_t mylocale = 0;
-		wchar_t    *workspace;
-		size_t		curr_char;
-		size_t		result_size;
 
 		if (collid != DEFAULT_COLLATION_OID)
 		{
@@ -1622,77 +1658,78 @@ str_toupper(const char *buff, size_t nbytes, Oid collid)
 			mylocale = pg_newlocale_from_collation(collid);
 		}
 
-		/* Overflow paranoia */
-		if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
-			ereport(ERROR,
-					(errcode(ERRCODE_OUT_OF_MEMORY),
-					 errmsg("out of memory")));
+#ifdef USE_ICU
+		if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
+		{
+			int32_t		len_uchar, len_conv;
+			UChar	   *buff_uchar;
+			UChar	   *buff_conv;
 
-		/* Output workspace cannot have more codes than input bytes */
-		workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
+			len_uchar = icu_to_uchar(mylocale, &buff_uchar, buff, nbytes);
+			len_conv = icu_convert_case(u_strToUpper, mylocale, &buff_conv, buff_uchar, len_uchar);
+			icu_from_uchar(mylocale, &result, buff_conv, len_conv);
+		}
+		else
+#endif
+		{
+			if (pg_database_encoding_max_length() > 1)
+			{
+				wchar_t    *workspace;
+				size_t		curr_char;
+				size_t		result_size;
 
-		char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
+				/* Overflow paranoia */
+				if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
+					ereport(ERROR,
+							(errcode(ERRCODE_OUT_OF_MEMORY),
+							 errmsg("out of memory")));
 
-		for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
-		{
-#ifdef HAVE_LOCALE_T
-			if (mylocale)
-				workspace[curr_char] = towupper_l(workspace[curr_char], mylocale);
-			else
-#endif
-				workspace[curr_char] = towupper(workspace[curr_char]);
-		}
+				/* Output workspace cannot have more codes than input bytes */
+				workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
 
-		/* Make result large enough; case change might change number of bytes */
-		result_size = curr_char * pg_database_encoding_max_length() + 1;
-		result = palloc(result_size);
+				char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
 
-		wchar2char(result, workspace, result_size, mylocale);
-		pfree(workspace);
-	}
-#endif   /* USE_WIDE_UPPER_LOWER */
-	else
-	{
+				for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
+				{
 #ifdef HAVE_LOCALE_T
-		pg_locale_t mylocale = 0;
+					if (mylocale)
+						workspace[curr_char] = towupper_l(workspace[curr_char], mylocale->lt);
+					else
 #endif
-		char	   *p;
+						workspace[curr_char] = towupper(workspace[curr_char]);
+				}
 
-		if (collid != DEFAULT_COLLATION_OID)
-		{
-			if (!OidIsValid(collid))
-			{
-				/*
-				 * This typically means that the parser could not resolve a
-				 * conflict of implicit collations, so report it that way.
-				 */
-				ereport(ERROR,
-						(errcode(ERRCODE_INDETERMINATE_COLLATION),
-						 errmsg("could not determine which collation to use for upper() function"),
-						 errhint("Use the COLLATE clause to set the collation explicitly.")));
+				/* Make result large enough; case change might change number of bytes */
+				result_size = curr_char * pg_database_encoding_max_length() + 1;
+				result = palloc(result_size);
+
+				wchar2char(result, workspace, result_size, mylocale);
+				pfree(workspace);
 			}
-#ifdef HAVE_LOCALE_T
-			mylocale = pg_newlocale_from_collation(collid);
-#endif
-		}
+#endif   /* USE_WIDE_UPPER_LOWER */
+			else
+			{
+				char	   *p;
 
-		result = pnstrdup(buff, nbytes);
+				result = pnstrdup(buff, nbytes);
 
-		/*
-		 * Note: we assume that toupper_l() will not be so broken as to need
-		 * an islower_l() guard test.  When using the default collation, we
-		 * apply the traditional Postgres behavior that forces ASCII-style
-		 * treatment of I/i, but in non-default collations you get exactly
-		 * what the collation says.
-		 */
-		for (p = result; *p; p++)
-		{
+				/*
+				 * Note: we assume that toupper_l() will not be so broken as to need
+				 * an islower_l() guard test.  When using the default collation, we
+				 * apply the traditional Postgres behavior that forces ASCII-style
+				 * treatment of I/i, but in non-default collations you get exactly
+				 * what the collation says.
+				 */
+				for (p = result; *p; p++)
+				{
 #ifdef HAVE_LOCALE_T
-			if (mylocale)
-				*p = toupper_l((unsigned char) *p, mylocale);
-			else
+					if (mylocale)
+						*p = toupper_l((unsigned char) *p, mylocale->lt);
+					else
 #endif
-				*p = pg_toupper((unsigned char) *p);
+						*p = pg_toupper((unsigned char) *p);
+				}
+			}
 		}
 	}
 
@@ -1720,12 +1757,9 @@ str_initcap(const char *buff, size_t nbytes, Oid collid)
 		result = asc_initcap(buff, nbytes);
 	}
 #ifdef USE_WIDE_UPPER_LOWER
-	else if (pg_database_encoding_max_length() > 1)
+	else
 	{
 		pg_locale_t mylocale = 0;
-		wchar_t    *workspace;
-		size_t		curr_char;
-		size_t		result_size;
 
 		if (collid != DEFAULT_COLLATION_OID)
 		{
@@ -1743,100 +1777,101 @@ str_initcap(const char *buff, size_t nbytes, Oid collid)
 			mylocale = pg_newlocale_from_collation(collid);
 		}
 
-		/* Overflow paranoia */
-		if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
-			ereport(ERROR,
-					(errcode(ERRCODE_OUT_OF_MEMORY),
-					 errmsg("out of memory")));
-
-		/* Output workspace cannot have more codes than input bytes */
-		workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
-
-		char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
-
-		for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
+#ifdef USE_ICU
+		if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
 		{
-#ifdef HAVE_LOCALE_T
-			if (mylocale)
-			{
-				if (wasalnum)
-					workspace[curr_char] = towlower_l(workspace[curr_char], mylocale);
-				else
-					workspace[curr_char] = towupper_l(workspace[curr_char], mylocale);
-				wasalnum = iswalnum_l(workspace[curr_char], mylocale);
-			}
-			else
+			int32_t		len_uchar, len_conv;
+			UChar	   *buff_uchar;
+			UChar	   *buff_conv;
+
+			len_uchar = icu_to_uchar(mylocale, &buff_uchar, buff, nbytes);
+			len_conv = icu_convert_case(u_strToTitle_default_BI, mylocale, &buff_conv, buff_uchar, len_uchar);
+			icu_from_uchar(mylocale, &result, buff_conv, len_conv);
+		}
+		else
 #endif
+		{
+			if (pg_database_encoding_max_length() > 1)
 			{
-				if (wasalnum)
-					workspace[curr_char] = towlower(workspace[curr_char]);
-				else
-					workspace[curr_char] = towupper(workspace[curr_char]);
-				wasalnum = iswalnum(workspace[curr_char]);
-			}
-		}
+				wchar_t    *workspace;
+				size_t		curr_char;
+				size_t		result_size;
 
-		/* Make result large enough; case change might change number of bytes */
-		result_size = curr_char * pg_database_encoding_max_length() + 1;
-		result = palloc(result_size);
+				/* Overflow paranoia */
+				if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
+					ereport(ERROR,
+							(errcode(ERRCODE_OUT_OF_MEMORY),
+							 errmsg("out of memory")));
 
-		wchar2char(result, workspace, result_size, mylocale);
-		pfree(workspace);
-	}
-#endif   /* USE_WIDE_UPPER_LOWER */
-	else
-	{
-#ifdef HAVE_LOCALE_T
-		pg_locale_t mylocale = 0;
-#endif
-		char	   *p;
+				/* Output workspace cannot have more codes than input bytes */
+				workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
 
-		if (collid != DEFAULT_COLLATION_OID)
-		{
-			if (!OidIsValid(collid))
-			{
-				/*
-				 * This typically means that the parser could not resolve a
-				 * conflict of implicit collations, so report it that way.
-				 */
-				ereport(ERROR,
-						(errcode(ERRCODE_INDETERMINATE_COLLATION),
-						 errmsg("could not determine which collation to use for initcap() function"),
-						 errhint("Use the COLLATE clause to set the collation explicitly.")));
-			}
+				char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
+
+				for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
+				{
 #ifdef HAVE_LOCALE_T
-			mylocale = pg_newlocale_from_collation(collid);
+					if (mylocale)
+					{
+						if (wasalnum)
+							workspace[curr_char] = towlower_l(workspace[curr_char], mylocale->lt);
+						else
+							workspace[curr_char] = towupper_l(workspace[curr_char], mylocale->lt);
+						wasalnum = iswalnum_l(workspace[curr_char], mylocale->lt);
+					}
+					else
 #endif
-		}
+					{
+						if (wasalnum)
+							workspace[curr_char] = towlower(workspace[curr_char]);
+						else
+							workspace[curr_char] = towupper(workspace[curr_char]);
+						wasalnum = iswalnum(workspace[curr_char]);
+					}
+				}
 
-		result = pnstrdup(buff, nbytes);
+				/* Make result large enough; case change might change number of bytes */
+				result_size = curr_char * pg_database_encoding_max_length() + 1;
+				result = palloc(result_size);
 
-		/*
-		 * Note: we assume that toupper_l()/tolower_l() will not be so broken
-		 * as to need guard tests.  When using the default collation, we apply
-		 * the traditional Postgres behavior that forces ASCII-style treatment
-		 * of I/i, but in non-default collations you get exactly what the
-		 * collation says.
-		 */
-		for (p = result; *p; p++)
-		{
-#ifdef HAVE_LOCALE_T
-			if (mylocale)
-			{
-				if (wasalnum)
-					*p = tolower_l((unsigned char) *p, mylocale);
-				else
-					*p = toupper_l((unsigned char) *p, mylocale);
-				wasalnum = isalnum_l((unsigned char) *p, mylocale);
+				wchar2char(result, workspace, result_size, mylocale);
+				pfree(workspace);
 			}
+#endif   /* USE_WIDE_UPPER_LOWER */
 			else
-#endif
 			{
-				if (wasalnum)
-					*p = pg_tolower((unsigned char) *p);
-				else
-					*p = pg_toupper((unsigned char) *p);
-				wasalnum = isalnum((unsigned char) *p);
+				char	   *p;
+
+				result = pnstrdup(buff, nbytes);
+
+				/*
+				 * Note: we assume that toupper_l()/tolower_l() will not be so broken
+				 * as to need guard tests.  When using the default collation, we apply
+				 * the traditional Postgres behavior that forces ASCII-style treatment
+				 * of I/i, but in non-default collations you get exactly what the
+				 * collation says.
+				 */
+				for (p = result; *p; p++)
+				{
+#ifdef HAVE_LOCALE_T
+					if (mylocale)
+					{
+						if (wasalnum)
+							*p = tolower_l((unsigned char) *p, mylocale->lt);
+						else
+							*p = toupper_l((unsigned char) *p, mylocale->lt);
+						wasalnum = isalnum_l((unsigned char) *p, mylocale->lt);
+					}
+					else
+#endif
+					{
+						if (wasalnum)
+							*p = pg_tolower((unsigned char) *p);
+						else
+							*p = pg_toupper((unsigned char) *p);
+						wasalnum = isalnum((unsigned char) *p);
+					}
+				}
 			}
 		}
 	}
diff --git a/src/backend/utils/adt/like.c b/src/backend/utils/adt/like.c
index 08b86f5168..1482707988 100644
--- a/src/backend/utils/adt/like.c
+++ b/src/backend/utils/adt/like.c
@@ -96,7 +96,7 @@ SB_lower_char(unsigned char c, pg_locale_t locale, bool locale_is_c)
 		return pg_ascii_tolower(c);
 #ifdef HAVE_LOCALE_T
 	else if (locale)
-		return tolower_l(c, locale);
+		return tolower_l(c, locale->lt);
 #endif
 	else
 		return pg_tolower(c);
@@ -165,14 +165,36 @@ Generic_Text_IC_like(text *str, text *pat, Oid collation)
 			   *p;
 	int			slen,
 				plen;
+	pg_locale_t locale = 0;
+	bool		locale_is_c = false;
+
+	if (lc_ctype_is_c(collation))
+		locale_is_c = true;
+	else if (collation != DEFAULT_COLLATION_OID)
+	{
+		if (!OidIsValid(collation))
+		{
+			/*
+			 * This typically means that the parser could not resolve a
+			 * conflict of implicit collations, so report it that way.
+			 */
+			ereport(ERROR,
+					(errcode(ERRCODE_INDETERMINATE_COLLATION),
+					 errmsg("could not determine which collation to use for ILIKE"),
+					 errhint("Use the COLLATE clause to set the collation explicitly.")));
+		}
+		locale = pg_newlocale_from_collation(collation);
+	}
 
 	/*
 	 * For efficiency reasons, in the single byte case we don't call lower()
 	 * on the pattern and text, but instead call SB_lower_char on each
-	 * character.  In the multi-byte case we don't have much choice :-(
+	 * character.  In the multi-byte case we don't have much choice :-(.
+	 * Likewise, ICU provides only whole-string case mapping, not locale-aware
+	 * per-character folding, so ICU collations also have to take the slow path.
 	 */
 
-	if (pg_database_encoding_max_length() > 1)
+	if (pg_database_encoding_max_length() > 1 || (locale && locale->provider == COLLPROVIDER_ICU))
 	{
 		/* lower's result is never packed, so OK to use old macros here */
 		pat = DatumGetTextP(DirectFunctionCall1Coll(lower, collation,
@@ -190,31 +212,6 @@ Generic_Text_IC_like(text *str, text *pat, Oid collation)
 	}
 	else
 	{
-		/*
-		 * Here we need to prepare locale information for SB_lower_char. This
-		 * should match the methods used in str_tolower().
-		 */
-		pg_locale_t locale = 0;
-		bool		locale_is_c = false;
-
-		if (lc_ctype_is_c(collation))
-			locale_is_c = true;
-		else if (collation != DEFAULT_COLLATION_OID)
-		{
-			if (!OidIsValid(collation))
-			{
-				/*
-				 * This typically means that the parser could not resolve a
-				 * conflict of implicit collations, so report it that way.
-				 */
-				ereport(ERROR,
-						(errcode(ERRCODE_INDETERMINATE_COLLATION),
-						 errmsg("could not determine which collation to use for ILIKE"),
-						 errhint("Use the COLLATE clause to set the collation explicitly.")));
-			}
-			locale = pg_newlocale_from_collation(collation);
-		}
-
 		p = VARDATA_ANY(pat);
 		plen = VARSIZE_ANY_EXHDR(pat);
 		s = VARDATA_ANY(str);
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index 8ac384f294..854e3b912c 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -58,7 +58,9 @@
 #include "catalog/pg_collation.h"
 #include "catalog/pg_control.h"
 #include "mb/pg_wchar.h"
+#include "utils/builtins.h"
 #include "utils/hsearch.h"
+#include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_locale.h"
 #include "utils/syscache.h"
@@ -1288,45 +1290,105 @@ pg_newlocale_from_collation(Oid collid)
 		collcollate = NameStr(collform->collcollate);
 		collctype = NameStr(collform->collctype);
 
-		if (strcmp(collcollate, collctype) == 0)
+		cache_entry->locale = malloc(sizeof(*cache_entry->locale));
+		if (!cache_entry->locale)
+			ereport(ERROR,
+					(errcode(ERRCODE_OUT_OF_MEMORY),
+					 errmsg("out of memory")));
+		memset(cache_entry->locale, 0, sizeof(*cache_entry->locale));
+		cache_entry->locale->provider = collform->collprovider;
+
+		if (collform->collprovider == COLLPROVIDER_LIBC)
 		{
-			/* Normal case where they're the same */
+			if (strcmp(collcollate, collctype) == 0)
+			{
+				/* Normal case where they're the same */
 #ifndef WIN32
-			result = newlocale(LC_COLLATE_MASK | LC_CTYPE_MASK, collcollate,
-							   NULL);
+				result = newlocale(LC_COLLATE_MASK | LC_CTYPE_MASK, collcollate,
+								   NULL);
 #else
-			result = _create_locale(LC_ALL, collcollate);
+				result = _create_locale(LC_ALL, collcollate);
 #endif
-			if (!result)
-				report_newlocale_failure(collcollate);
-		}
-		else
-		{
+				if (!result)
+					report_newlocale_failure(collcollate);
+			}
+			else
+			{
 #ifndef WIN32
-			/* We need two newlocale() steps */
-			locale_t	loc1;
-
-			loc1 = newlocale(LC_COLLATE_MASK, collcollate, NULL);
-			if (!loc1)
-				report_newlocale_failure(collcollate);
-			result = newlocale(LC_CTYPE_MASK, collctype, loc1);
-			if (!result)
-				report_newlocale_failure(collctype);
+				/* We need two newlocale() steps */
+				locale_t	loc1;
+
+				loc1 = newlocale(LC_COLLATE_MASK, collcollate, NULL);
+				if (!loc1)
+					report_newlocale_failure(collcollate);
+				result = newlocale(LC_CTYPE_MASK, collctype, loc1);
+				if (!result)
+					report_newlocale_failure(collctype);
 #else
 
-			/*
-			 * XXX The _create_locale() API doesn't appear to support this.
-			 * Could perhaps be worked around by changing pg_locale_t to
-			 * contain two separate fields.
-			 */
+				/*
+				 * XXX The _create_locale() API doesn't appear to support this.
+				 * Could perhaps be worked around by changing pg_locale_t to
+				 * contain two separate fields.
+				 */
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("collations with different collate and ctype values are not supported on this platform")));
+#endif
+			}
+
+			cache_entry->locale->lt = result;
+		}
+		else if (collform->collprovider == COLLPROVIDER_ICU)
+		{
+#ifdef USE_ICU
+			const char *collcollate;
+			UCollator  *collator;
+			const char *encstring;
+			UConverter *converter;
+			UErrorCode	status;
+			UVersionInfo versioninfo;
+			int32		numversion;
+
+			collcollate = NameStr(collform->collcollate);
+
+			status = U_ZERO_ERROR;
+			collator = ucol_open(collcollate, &status);
+			if (U_FAILURE(status))
+				ereport(ERROR,
+						(errmsg("could not open collator for locale \"%s\": %s",
+								collcollate, u_errorName(status))));
+
+			encstring = pg_enc2name_tbl[GetDatabaseEncoding()].name;
+
+			status = U_ZERO_ERROR;
+			converter = ucnv_open(encstring, &status);
+			if (U_FAILURE(status))
+				ereport(ERROR,
+						(errmsg("could not open converter for encoding \"%s\": %s",
+								encstring, u_errorName(status))));
+
+			ucol_getVersion(collator, versioninfo);
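+			/*
+			 * Pack the four UVersionInfo bytes into one integer, independent
+			 * of host byte order, so it can be compared with the collversion
+			 * recorded in pg_collation.
+			 */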
+			numversion = ntohl(*((uint32 *) versioninfo));
+
+			if (numversion != collform->collversion)
+				ereport(WARNING,
+						(errmsg("ICU collator version mismatch"),
+						 errdetail("The collation was created using version 0x%08X, but the ICU library provides version 0x%08X.",
+								   (uint32) collform->collversion, (uint32) numversion),
+						 errhint("Rebuild all objects affected by this collation and run ALTER COLLATION %s REFRESH VERSION, "
+								 "or build PostgreSQL with the right version of ICU.",
+								 quote_qualified_identifier(get_namespace_name(collform->collnamespace),
+															NameStr(collform->collname)))));
+
+			cache_entry->locale->ucol = collator;
+			cache_entry->locale->locale = strdup(collcollate);
+			cache_entry->locale->ucnv = converter;
+#else /* not USE_ICU */
+			/* could get here if a collation was created by a build with ICU */
 			ereport(ERROR,
 					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-					 errmsg("collations with different collate and ctype values are not supported on this platform")));
-#endif
+					 errmsg("ICU is not supported in this build"),
+					 errhint("You need to rebuild PostgreSQL using --with-icu.")));
+#endif /* not USE_ICU */
 		}
 
-		cache_entry->locale = result;
-
 		ReleaseSysCache(tp);
 #else							/* not HAVE_LOCALE_T */
 
@@ -1344,6 +1406,40 @@ pg_newlocale_from_collation(Oid collid)
 }
 
 
+#ifdef USE_ICU
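+/*
+ * Convert a string in the database encoding into a newly palloc'd array of
+ * UChars (UTF-16).  The length of the result, in UChars, is returned.
+ */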
+int32_t
+icu_to_uchar(pg_locale_t mylocale, UChar **buff_uchar, const char *buff, size_t nbytes)
+{
+	UErrorCode	status;
+	int32_t		len_uchar;
+
+	len_uchar = 2 * nbytes;  /* max length per docs */
+	*buff_uchar = palloc(len_uchar * sizeof(**buff_uchar));
+	status = U_ZERO_ERROR;
+	len_uchar = ucnv_toUChars(mylocale->ucnv, *buff_uchar, len_uchar, buff, nbytes, &status);
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("ucnv_toUChars failed: %s", u_errorName(status))));
+	return len_uchar;
+}
+
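+/*
+ * Convert an array of UChars back into a newly palloc'd string in the
+ * database encoding.  The length of the result, in bytes, is returned.
+ */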
+int32_t
+icu_from_uchar(pg_locale_t mylocale, char **result, UChar *buff_uchar, int32_t len_uchar)
+{
+	UErrorCode	status;
+	int32_t		len_result;
+
+	len_result = UCNV_GET_MAX_BYTES_FOR_STRING(len_uchar, ucnv_getMaxCharSize(mylocale->ucnv));
+	*result = palloc(len_result + 1);
+	status = U_ZERO_ERROR;
+	len_result = ucnv_fromUChars(mylocale->ucnv, *result, len_result + 1,
+								 buff_uchar, len_uchar, &status);
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("ucnv_fromUChars failed: %s", u_errorName(status))));
+	return len_result;
+}
+#endif
+
 /*
  * These functions convert from/to libc's wchar_t, *not* pg_wchar_t.
  * Therefore we keep them here rather than with the mbutils code.
@@ -1363,6 +1459,8 @@ wchar2char(char *to, const wchar_t *from, size_t tolen, pg_locale_t locale)
 {
 	size_t		result;
 
+	Assert(!locale || locale->provider == COLLPROVIDER_LIBC);
+
 	if (tolen == 0)
 		return 0;
 
@@ -1399,7 +1497,7 @@ wchar2char(char *to, const wchar_t *from, size_t tolen, pg_locale_t locale)
 #ifdef HAVE_LOCALE_T
 #ifdef HAVE_WCSTOMBS_L
 		/* Use wcstombs_l for nondefault locales */
-		result = wcstombs_l(to, from, tolen, locale);
+		result = wcstombs_l(to, from, tolen, locale->lt);
 #else							/* !HAVE_WCSTOMBS_L */
 		/* We have to temporarily set the locale as current ... ugh */
-		locale_t	save_locale = uselocale(locale);
+		locale_t	save_locale = uselocale(locale->lt);
@@ -1433,6 +1531,8 @@ char2wchar(wchar_t *to, size_t tolen, const char *from, size_t fromlen,
 {
 	size_t		result;
 
+	Assert(!locale || locale->provider == COLLPROVIDER_LIBC);
+
 	if (tolen == 0)
 		return 0;
 
@@ -1474,7 +1574,7 @@ char2wchar(wchar_t *to, size_t tolen, const char *from, size_t fromlen,
 #ifdef HAVE_LOCALE_T
 #ifdef HAVE_MBSTOWCS_L
 			/* Use mbstowcs_l for nondefault locales */
-			result = mbstowcs_l(to, str, tolen, locale);
+			result = mbstowcs_l(to, str, tolen, locale->lt);
 #else							/* !HAVE_MBSTOWCS_L */
 			/* We have to temporarily set the locale as current ... ugh */
-			locale_t	save_locale = uselocale(locale);
+			locale_t	save_locale = uselocale(locale->lt);
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 4973396b80..f7c81a2b13 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -5274,7 +5274,7 @@ find_join_input_rel(PlannerInfo *root, Relids relids)
 /*
  * Check whether char is a letter (and, hence, subject to case-folding)
  *
- * In multibyte character sets, we can't use isalpha, and it does not seem
+ * In multibyte character sets or with ICU, we can't use isalpha, and it does not seem
  * worth trying to convert to wchar_t to use iswalpha.  Instead, just assume
  * any multibyte char is potentially case-varying.
  */
@@ -5284,11 +5284,11 @@ pattern_char_isalpha(char c, bool is_multibyte,
 {
 	if (locale_is_c)
 		return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
-	else if (is_multibyte && IS_HIGHBIT_SET(c))
+	else if ((is_multibyte || (locale && locale->provider == COLLPROVIDER_ICU)) && IS_HIGHBIT_SET(c))
 		return true;
 #ifdef HAVE_LOCALE_T
 	else if (locale)
-		return isalpha_l((unsigned char) c, locale);
+		return isalpha_l((unsigned char) c, locale->lt);
 #endif
 	else
 		return isalpha((unsigned char) c);
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index 260a5aac49..6af649297a 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -1402,10 +1402,7 @@ varstr_cmp(char *arg1, int len1, char *arg2, int len2, Oid collid)
 		char		a2buf[TEXTBUFLEN];
 		char	   *a1p,
 				   *a2p;
-
-#ifdef HAVE_LOCALE_T
 		pg_locale_t mylocale = 0;
-#endif
 
 		if (collid != DEFAULT_COLLATION_OID)
 		{
@@ -1420,9 +1417,7 @@ varstr_cmp(char *arg1, int len1, char *arg2, int len2, Oid collid)
 						 errmsg("could not determine which collation to use for string comparison"),
 						 errhint("Use the COLLATE clause to set the collation explicitly.")));
 			}
-#ifdef HAVE_LOCALE_T
 			mylocale = pg_newlocale_from_collation(collid);
-#endif
 		}
 
 		/*
@@ -1541,9 +1536,44 @@ varstr_cmp(char *arg1, int len1, char *arg2, int len2, Oid collid)
 		memcpy(a2p, arg2, len2);
 		a2p[len2] = '\0';
 
-#ifdef HAVE_LOCALE_T
 		if (mylocale)
-			result = strcoll_l(a1p, a2p, mylocale);
+		{
+#ifdef USE_ICU
+			if (mylocale->provider == COLLPROVIDER_ICU)
+			{
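+				/*
+				 * If the database encoding is UTF-8, ucol_strcollUTF8() (when
+				 * available) can compare the strings directly; otherwise
+				 * convert them to UChars first and use ucol_strcoll().
+				 */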
+#ifdef HAVE_UCOL_STRCOLLUTF8
+				if (GetDatabaseEncoding() == PG_UTF8)
+				{
+					UErrorCode	status;
+
+					status = U_ZERO_ERROR;
+					result = ucol_strcollUTF8(mylocale->ucol,
+											  arg1, len1,
+											  arg2, len2,
+											  &status);
+					if (U_FAILURE(status))
+						ereport(ERROR,
+								(errmsg("collation failed: %s", u_errorName(status))));
+				}
+				else
+#endif
+				{
+					int32_t ulen1, ulen2;
+					UChar *uchar1, *uchar2;
+
+					ulen1 = icu_to_uchar(mylocale, &uchar1, arg1, len1);
+					ulen2 = icu_to_uchar(mylocale, &uchar2, arg2, len2);
+
+					result = ucol_strcoll(mylocale->ucol,
+										  uchar1, ulen1,
+										  uchar2, ulen2);
+				}
+			}
+			else
+#endif
+			{
+#ifdef HAVE_LOCALE_T
+				result = strcoll_l(a1p, a2p, mylocale->lt);
+#else
+				result = strcoll(a1p, a2p);
+#endif
+			}
+		}
 		else
-#endif
 			result = strcoll(a1p, a2p);
@@ -1767,10 +1797,7 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
 	bool		abbreviate = ssup->abbreviate;
 	bool		collate_c = false;
 	VarStringSortSupport *sss;
-
-#ifdef HAVE_LOCALE_T
 	pg_locale_t locale = 0;
-#endif
 
 	/*
 	 * If possible, set ssup->comparator to a function which can be used to
@@ -1825,9 +1852,7 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
 						 errmsg("could not determine which collation to use for string comparison"),
 						 errhint("Use the COLLATE clause to set the collation explicitly.")));
 			}
-#ifdef HAVE_LOCALE_T
 			locale = pg_newlocale_from_collation(collid);
-#endif
 		}
 	}
 
@@ -1853,7 +1878,7 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
 	 * platforms.
 	 */
 #ifndef TRUST_STRXFRM
-	if (!collate_c)
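+	/*
+	 * Unlike strxfrm(), ICU sort keys from ucol_getSortKey() are expected to
+	 * be consistent with ucol_strcoll(), so abbreviation is kept enabled for
+	 * ICU collations even when strxfrm() is not trusted.
+	 */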
+	if (!collate_c && !(locale && locale->provider == COLLPROVIDER_ICU))
 		abbreviate = false;
 #endif
 
@@ -2089,9 +2114,44 @@ varstrfastcmp_locale(Datum x, Datum y, SortSupport ssup)
 		goto done;
 	}
 
-#ifdef HAVE_LOCALE_T
 	if (sss->locale)
-		result = strcoll_l(sss->buf1, sss->buf2, sss->locale);
+	{
+#ifdef USE_ICU
+		if (sss->locale->provider == COLLPROVIDER_ICU)
+		{
+#ifdef HAVE_UCOL_STRCOLLUTF8
+			if (GetDatabaseEncoding() == PG_UTF8)
+			{
+				UErrorCode	status;
+
+				status = U_ZERO_ERROR;
+				result = ucol_strcollUTF8(sss->locale->ucol,
+										  a1p, len1,
+										  a2p, len2,
+										  &status);
+				if (U_FAILURE(status))
+					ereport(ERROR,
+							(errmsg("collation failed: %s", u_errorName(status))));
+			}
+			else
+#endif
+			{
+				int32_t ulen1, ulen2;
+				UChar *uchar1, *uchar2;
+
+				ulen1 = icu_to_uchar(sss->locale, &uchar1, a1p, len1);
+				ulen2 = icu_to_uchar(sss->locale, &uchar2, a2p, len2);
+
+				result = ucol_strcoll(sss->locale->ucol,
+									  uchar1, ulen1,
+									  uchar2, ulen2);
+			}
+		}
+		else
+#endif
+		{
+#ifdef HAVE_LOCALE_T
+			result = strcoll_l(sss->buf1, sss->buf2, sss->locale->lt);
+#else
+			result = strcoll(sss->buf1, sss->buf2);
+#endif
+		}
+	}
 	else
-#endif
 		result = strcoll(sss->buf1, sss->buf2);
@@ -2199,6 +2259,10 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
 	else
 	{
 		Size		bsize;
+#ifdef USE_ICU
+		int32_t		ulen = -1;
+		UChar	   *uchar;
+#endif
 
 		/*
 		 * We're not using the C collation, so fall back on strxfrm.
@@ -2226,12 +2290,22 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
 		sss->buf1[len] = '\0';
 		sss->last_len1 = len;
 
+#ifdef USE_ICU
+		if (sss->locale && sss->locale->provider == COLLPROVIDER_ICU)
+			ulen = icu_to_uchar(sss->locale, &uchar, sss->buf1, len);
+#endif
+
 		for (;;)
 		{
+#ifdef USE_ICU
+			if (sss->locale && sss->locale->provider == COLLPROVIDER_ICU)
+				bsize = ucol_getSortKey(sss->locale->ucol, uchar, ulen, (uint8_t *) sss->buf2, sss->buflen2);
+			else
+#endif
 #ifdef HAVE_LOCALE_T
-			if (sss->locale)
+			if (sss->locale && sss->locale->provider == COLLPROVIDER_LIBC)
 				bsize = strxfrm_l(sss->buf2, sss->buf1,
-								  sss->buflen2, sss->locale);
+								  sss->buflen2, sss->locale->lt);
 			else
 #endif
 				bsize = strxfrm(sss->buf2, sss->buf1, sss->buflen2);
diff --git a/src/bin/initdb/Makefile b/src/bin/initdb/Makefile
index 394eae0875..404fd33a8c 100644
--- a/src/bin/initdb/Makefile
+++ b/src/bin/initdb/Makefile
@@ -16,7 +16,7 @@ subdir = src/bin/initdb
 top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global
 
-override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) -I$(top_srcdir)/src/timezone $(CPPFLAGS)
+override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) -I$(top_srcdir)/src/timezone $(CPPFLAGS) $(ICU_CFLAGS)
 
 # note: we need libpq only because fe_utils does
 LDFLAGS += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
@@ -31,7 +31,7 @@ OBJS=	initdb.o findtimezone.o localtime.o encnames.o $(WIN32RES)
 all: initdb
 
 initdb: $(OBJS) | submake-libpq submake-libpgport submake-libpgfeutils
-	$(CC) $(CFLAGS) $(OBJS) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X)
+	$(CC) $(CFLAGS) $(OBJS) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) $(ICU_LIBS) -o $@$(X)
 
 # We used to pull in all of libpq to get encnames.c, but that
 # exposes us to risks of version skew if we link to a shared library.
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 24f9cc8eae..9564113386 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -70,6 +70,11 @@
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 
+#ifdef USE_ICU
+#include <unicode/ucol.h>
+#include <unicode/uloc.h>
+#endif
+
 
 /* Ideally this would be in a .h file, but it hardly seems worth the trouble */
 extern const char *select_default_timezone(const char *share_path);
@@ -319,6 +324,12 @@ do { \
 		output_failed = true, output_errno = errno; \
 } while (0)
 
+#define PG_CMD_PRINTF4(fmt, arg1, arg2, arg3, arg4)		\
+do { \
+	if (fprintf(cmdfd, fmt, arg1, arg2, arg3, arg4) < 0 || fflush(cmdfd) < 0) \
+		output_failed = true, output_errno = errno; \
+} while (0)
+
 static char *
 escape_quotes(const char *src)
 {
@@ -1756,12 +1767,12 @@ setup_collation(FILE *cmdfd)
 	 * Also, eliminate any aliases that conflict with pg_collation's
 	 * hard-wired entries for "C" etc.
 	 */
-	PG_CMD_PUTS("INSERT INTO pg_collation (collname, collnamespace, collowner, collencoding, collcollate, collctype) "
+	PG_CMD_PUTS("INSERT INTO pg_collation (collname, collnamespace, collowner, collprovider, collencoding, collcollate, collctype, collversion) "
 				" SELECT DISTINCT ON (collname, encoding)"
 				"   collname, "
 				"   (SELECT oid FROM pg_namespace WHERE nspname = 'pg_catalog') AS collnamespace, "
 				"   (SELECT relowner FROM pg_class WHERE relname = 'pg_collation') AS collowner, "
-				"   encoding, locale, locale "
+				"   'c', encoding, locale, locale, 0 "
 				"  FROM tmp_pg_collation"
 				"  WHERE NOT EXISTS (SELECT 1 FROM pg_collation WHERE collname = tmp_pg_collation.collname)"
 	 "  ORDER BY collname, encoding, (collname = locale) DESC, locale;\n\n");
@@ -1780,6 +1791,41 @@ setup_collation(FILE *cmdfd)
 		printf(_("Use the option \"--debug\" to see details.\n"));
 	}
 #endif   /* not HAVE_LOCALE_T  && not WIN32 */
+
+#ifdef USE_ICU
+	{
+		int i;
+
+		for (i = 0; i < uloc_countAvailable(); i++)
+		{
+			const char *name = uloc_getAvailable(i);
+			UCollator  *collator;
+			UErrorCode	status;
+			UVersionInfo versioninfo;
+			int32		collversion;
+
+			status = U_ZERO_ERROR;
+			collator = ucol_open(name, &status);
+			if (U_FAILURE(status))
+			{
+				fprintf(stderr, "%s: could not open collator for locale \"%s\": %s\n",
+						progname, name, u_errorName(status));
+				exit_nicely();
+			}
+			ucol_getVersion(collator, versioninfo);
+			ucol_close(collator);
+
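+			/* Pack the version bytes the same way pg_newlocale_from_collation() does. */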
+			collversion = ntohl(*((uint32 *) versioninfo));
+
+			PG_CMD_PRINTF4("INSERT INTO pg_collation (collname, collnamespace, collowner, collprovider, collencoding, collcollate, collctype, collversion) "
+						   "VALUES ('%s%%icu', "
+						   "(SELECT oid FROM pg_namespace WHERE nspname = 'pg_catalog'), "
+						   "(SELECT relowner FROM pg_class WHERE relname = 'pg_collation'), "
+						   "'i', -1, '%s', '%s', %d);\n\n",
+						   name, name, name, collversion);
+		}
+	}
+#endif
 }
 
 /*
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index e5545b31d4..df5a7107cb 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -12262,8 +12262,10 @@ dumpCollation(Archive *fout, CollInfo *collinfo)
 	PQExpBuffer delq;
 	PQExpBuffer labelq;
 	PGresult   *res;
+	int			i_collprovider;
 	int			i_collcollate;
 	int			i_collctype;
+	const char *collprovider;
 	const char *collcollate;
 	const char *collctype;
 
@@ -12280,18 +12282,30 @@ dumpCollation(Archive *fout, CollInfo *collinfo)
 	selectSourceSchema(fout, collinfo->dobj.namespace->dobj.name);
 
 	/* Get collation-specific details */
-	appendPQExpBuffer(query, "SELECT "
-					  "collcollate, "
-					  "collctype "
-					  "FROM pg_catalog.pg_collation c "
-					  "WHERE c.oid = '%u'::pg_catalog.oid",
-					  collinfo->dobj.catId.oid);
+	if (fout->remoteVersion >= 100000)
+		appendPQExpBuffer(query, "SELECT "
+						  "collprovider, "
+						  "collcollate, "
+						  "collctype "
+						  "FROM pg_catalog.pg_collation c "
+						  "WHERE c.oid = '%u'::pg_catalog.oid",
+						  collinfo->dobj.catId.oid);
+	else
+		appendPQExpBuffer(query, "SELECT "
+						  "'p'::char AS collprovider, "
+						  "collcollate, "
+						  "collctype "
+						  "FROM pg_catalog.pg_collation c "
+						  "WHERE c.oid = '%u'::pg_catalog.oid",
+						  collinfo->dobj.catId.oid);
 
 	res = ExecuteSqlQueryForSingleRow(fout, query->data);
 
+	i_collprovider = PQfnumber(res, "collprovider");
 	i_collcollate = PQfnumber(res, "collcollate");
 	i_collctype = PQfnumber(res, "collctype");
 
+	collprovider = PQgetvalue(res, 0, i_collprovider);
 	collcollate = PQgetvalue(res, 0, i_collcollate);
 	collctype = PQgetvalue(res, 0, i_collctype);
 
@@ -12303,11 +12317,32 @@ dumpCollation(Archive *fout, CollInfo *collinfo)
 	appendPQExpBuffer(delq, ".%s;\n",
 					  fmtId(collinfo->dobj.name));
 
-	appendPQExpBuffer(q, "CREATE COLLATION %s (lc_collate = ",
+	appendPQExpBuffer(q, "CREATE COLLATION %s (",
 					  fmtId(collinfo->dobj.name));
-	appendStringLiteralAH(q, collcollate, fout);
-	appendPQExpBufferStr(q, ", lc_ctype = ");
-	appendStringLiteralAH(q, collctype, fout);
+
+	appendPQExpBufferStr(q, "provider = ");
+	if (collprovider[0] == 'c')
+		appendStringLiteralAH(q, "libc", fout);
+	else if (collprovider[0] == 'i')
+		appendStringLiteralAH(q, "icu", fout);
+	else
+		exit_horribly(NULL,
+					  "unrecognized collation provider: %s\n",
+					  collprovider);
+
+	if (strcmp(collcollate, collctype) == 0)
+	{
+		appendPQExpBufferStr(q, ", locale = ");
+		appendStringLiteralAH(q, collcollate, fout);
+	}
+	else
+	{
+		appendPQExpBufferStr(q, ", lc_collate = ");
+		appendStringLiteralAH(q, collcollate, fout);
+		appendPQExpBufferStr(q, ", lc_ctype = ");
+		appendStringLiteralAH(q, collctype, fout);
+	}
+
 	appendPQExpBufferStr(q, ");\n");
 
 	appendPQExpBuffer(labelq, "COLLATION %s", fmtId(collinfo->dobj.name));
diff --git a/src/include/catalog/pg_collation.h b/src/include/catalog/pg_collation.h
index 636771202e..3a4a4b05b4 100644
--- a/src/include/catalog/pg_collation.h
+++ b/src/include/catalog/pg_collation.h
@@ -34,9 +34,11 @@ CATALOG(pg_collation,3456)
 	NameData	collname;		/* collation name */
 	Oid			collnamespace;	/* OID of namespace containing collation */
 	Oid			collowner;		/* owner of collation */
+	char		collprovider;	/* see constants below */
 	int32		collencoding;	/* encoding for this collation; -1 = "all" */
 	NameData	collcollate;	/* LC_COLLATE setting */
 	NameData	collctype;		/* LC_CTYPE setting */
+	int32		collversion;	/* provider-dependent version of collation data */
 } FormData_pg_collation;
 
 /* ----------------
@@ -50,27 +52,34 @@ typedef FormData_pg_collation *Form_pg_collation;
  *		compiler constants for pg_collation
  * ----------------
  */
-#define Natts_pg_collation				6
+#define Natts_pg_collation				8
 #define Anum_pg_collation_collname		1
 #define Anum_pg_collation_collnamespace 2
 #define Anum_pg_collation_collowner		3
-#define Anum_pg_collation_collencoding	4
-#define Anum_pg_collation_collcollate	5
-#define Anum_pg_collation_collctype		6
+#define Anum_pg_collation_collprovider	4
+#define Anum_pg_collation_collencoding	5
+#define Anum_pg_collation_collcollate	6
+#define Anum_pg_collation_collctype		7
+#define Anum_pg_collation_collversion	8
 
 /* ----------------
  *		initial contents of pg_collation
  * ----------------
  */
 
-DATA(insert OID = 100 ( default		PGNSP PGUID -1 "" "" ));
+DATA(insert OID = 100 ( default		PGNSP PGUID d -1 "" "" 0 ));
 DESCR("database's default collation");
 #define DEFAULT_COLLATION_OID	100
-DATA(insert OID = 950 ( C			PGNSP PGUID -1 "C" "C" ));
+DATA(insert OID = 950 ( C			PGNSP PGUID c -1 "C" "C" 0 ));
 DESCR("standard C collation");
 #define C_COLLATION_OID			950
-DATA(insert OID = 951 ( POSIX		PGNSP PGUID -1 "POSIX" "POSIX" ));
+DATA(insert OID = 951 ( POSIX		PGNSP PGUID c -1 "POSIX" "POSIX" 0 ));
 DESCR("standard POSIX collation");
 #define POSIX_COLLATION_OID		951
 
+
+#define COLLPROVIDER_DEFAULT	'd'
+#define COLLPROVIDER_ICU		'i'
+#define COLLPROVIDER_LIBC		'c'
+
 #endif   /* PG_COLLATION_H */
diff --git a/src/include/catalog/pg_collation_fn.h b/src/include/catalog/pg_collation_fn.h
index 574b288acc..bda83b9c22 100644
--- a/src/include/catalog/pg_collation_fn.h
+++ b/src/include/catalog/pg_collation_fn.h
@@ -16,8 +16,10 @@
 
 extern Oid CollationCreate(const char *collname, Oid collnamespace,
 				Oid collowner,
+						   char collprovider,
 				int32 collencoding,
-				const char *collcollate, const char *collctype);
+						   const char *collcollate, const char *collctype,
+						   int32 collversion);
 extern void RemoveCollationById(Oid collationOid);
 
 #endif   /* PG_COLLATION_FN_H */
diff --git a/src/include/commands/collationcmds.h b/src/include/commands/collationcmds.h
index 073314e76d..f6134a9b72 100644
--- a/src/include/commands/collationcmds.h
+++ b/src/include/commands/collationcmds.h
@@ -20,5 +20,6 @@
 
 extern ObjectAddress DefineCollation(ParseState *pstate, List *names, List *parameters);
 extern void IsThereCollationInNamespace(const char *collname, Oid nspOid);
+extern ObjectAddress AlterCollation(AlterCollationStmt *stmt);
 
 #endif   /* COLLATIONCMDS_H */
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index c514d3fc93..7d235c01e3 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -407,6 +407,7 @@ typedef enum NodeTag
 	T_CreateTransformStmt,
 	T_CreateAmStmt,
 	T_PartitionCmd,
+	T_AlterCollationStmt,
 
 	/*
 	 * TAGS FOR PARSE TREE NODES (parsenodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index fc532fbd43..1bf17f3a74 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1656,6 +1656,17 @@ typedef struct AlterTableCmd	/* one subcommand of an ALTER TABLE */
 
 
 /* ----------------------
+ * Alter Collation
+ * ----------------------
+ */
+typedef struct AlterCollationStmt
+{
+	NodeTag		type;
+	List	   *collname;
+} AlterCollationStmt;
+
+
+/* ----------------------
  *	Alter Domain
  *
  * The fields are used in different ways by the different variants of
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 42a3fc862e..0ac0e9ed5b 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -615,6 +615,9 @@
 /* Define to 1 if you have the external array `tzname'. */
 #undef HAVE_TZNAME
 
+/* Define to 1 if you have the `ucol_strcollUTF8' function. */
+#undef HAVE_UCOL_STRCOLLUTF8
+
 /* Define to 1 if you have the <ucred.h> header file. */
 #undef HAVE_UCRED_H
 
@@ -828,6 +831,9 @@
    (--enable-float8-byval) */
 #undef USE_FLOAT8_BYVAL
 
+/* Define to build with ICU support. (--with-icu) */
+#undef USE_ICU
+
 /* Define to 1 if you want 64-bit integer timestamp and interval support.
    (--enable-integer-datetimes) */
 #undef USE_INTEGER_DATETIMES
diff --git a/src/include/utils/pg_locale.h b/src/include/utils/pg_locale.h
index 0a4b9f7bd2..bb481194e3 100644
--- a/src/include/utils/pg_locale.h
+++ b/src/include/utils/pg_locale.h
@@ -16,6 +16,10 @@
 #if defined(LOCALE_T_IN_XLOCALE) || defined(WCSTOMBS_L_IN_XLOCALE)
 #include <xlocale.h>
 #endif
+#ifdef USE_ICU
+#include <unicode/ucnv.h>
+#include <unicode/ucol.h>
+#endif
 
 #include "utils/guc.h"
 
@@ -62,17 +66,30 @@ extern void cache_locale_time(void);
  * We define our own wrapper around locale_t so we can keep the same
  * function signatures for all builds, while not having to create a
  * fake version of the standard type locale_t in the global namespace.
- * The fake version of pg_locale_t can be checked for truth; that's
- * about all it will be needed for.
+ * pg_locale_t is occasionally checked for truth, so make it a pointer.
  */
+struct pg_locale_t
+{
+	char	provider;		/* see COLLPROVIDER_* constants in pg_collation.h */
 #ifdef HAVE_LOCALE_T
-typedef locale_t pg_locale_t;
-#else
-typedef int pg_locale_t;
+	locale_t lt;			/* libc locale_t handle */
+#endif
+#ifdef USE_ICU
+	const char *locale;		/* ICU locale ID */
+	UCollator *ucol;		/* ICU collator for string comparison */
+	UConverter *ucnv;		/* converter between the database encoding and UChar */
 #endif
+};
+
+typedef struct pg_locale_t *pg_locale_t;
 
 extern pg_locale_t pg_newlocale_from_collation(Oid collid);
 
+#ifdef USE_ICU
+extern int32_t icu_to_uchar(pg_locale_t mylocale, UChar **buff_uchar, const char *buff, size_t nbytes);
+extern int32_t icu_from_uchar(pg_locale_t mylocale, char **result, UChar *buff_uchar, int32_t len_uchar);
+#endif
+
 /* These functions convert from/to libc's wchar_t, *not* pg_wchar_t */
 #ifdef USE_WIDE_UPPER_LOWER
 extern size_t wchar2char(char *to, const wchar_t *from, size_t tolen,
diff --git a/src/test/regress/GNUmakefile b/src/test/regress/GNUmakefile
index 469b0937a2..dce0180ec4 100644
--- a/src/test/regress/GNUmakefile
+++ b/src/test/regress/GNUmakefile
@@ -125,6 +125,9 @@ tablespace-setup:
 ##
 
 REGRESS_OPTS = --dlpath=. $(EXTRA_REGRESS_OPTS)
+ifeq ($(with_icu),yes)
+EXTRA_TESTS += collate.icu
+endif
 
 check: all tablespace-setup
 	$(pg_regress_check) $(REGRESS_OPTS) --schedule=$(srcdir)/parallel_schedule $(MAXCONNOPT) $(EXTRA_TESTS)
diff --git a/src/test/regress/expected/collate.icu.out b/src/test/regress/expected/collate.icu.out
new file mode 100644
index 0000000000..6787a9638a
--- /dev/null
+++ b/src/test/regress/expected/collate.icu.out
@@ -0,0 +1,1107 @@
+/*
+ * This test is for ICU collations.
+ */
+SET client_encoding TO UTF8;
+CREATE TABLE collate_test1 (
+    a int,
+    b text COLLATE "en_US%icu" NOT NULL
+);
+\d collate_test1
+           Table "public.collate_test1"
+ Column |  Type   | Collation | Nullable | Default 
+--------+---------+-----------+----------+---------
+ a      | integer |           |          | 
+ b      | text    | en_US%icu | not null | 
+
+CREATE TABLE collate_test_fail (
+    a int,
+    b text COLLATE "ja_JP.eucjp%icu"
+);
+ERROR:  collation "ja_JP.eucjp%icu" for encoding "UTF8" does not exist
+LINE 3:     b text COLLATE "ja_JP.eucjp%icu"
+                   ^
+CREATE TABLE collate_test_fail (
+    a int,
+    b text COLLATE "foo%icu"
+);
+ERROR:  collation "foo%icu" for encoding "UTF8" does not exist
+LINE 3:     b text COLLATE "foo%icu"
+                   ^
+CREATE TABLE collate_test_fail (
+    a int COLLATE "en_US%icu",
+    b text
+);
+ERROR:  collations are not supported by type integer
+LINE 2:     a int COLLATE "en_US%icu",
+                  ^
+CREATE TABLE collate_test_like (
+    LIKE collate_test1
+);
+\d collate_test_like
+         Table "public.collate_test_like"
+ Column |  Type   | Collation | Nullable | Default 
+--------+---------+-----------+----------+---------
+ a      | integer |           |          | 
+ b      | text    | en_US%icu | not null | 
+
+CREATE TABLE collate_test2 (
+    a int,
+    b text COLLATE "sv_SE%icu"
+);
+CREATE TABLE collate_test3 (
+    a int,
+    b text COLLATE "C"
+);
+INSERT INTO collate_test1 VALUES (1, 'abc'), (2, 'äbc'), (3, 'bbc'), (4, 'ABC');
+INSERT INTO collate_test2 SELECT * FROM collate_test1;
+INSERT INTO collate_test3 SELECT * FROM collate_test1;
+SELECT * FROM collate_test1 WHERE b >= 'bbc';
+ a |  b  
+---+-----
+ 3 | bbc
+(1 row)
+
+SELECT * FROM collate_test2 WHERE b >= 'bbc';
+ a |  b  
+---+-----
+ 2 | äbc
+ 3 | bbc
+(2 rows)
+
+SELECT * FROM collate_test3 WHERE b >= 'bbc';
+ a |  b  
+---+-----
+ 2 | äbc
+ 3 | bbc
+(2 rows)
+
+SELECT * FROM collate_test3 WHERE b >= 'BBC';
+ a |  b  
+---+-----
+ 1 | abc
+ 2 | äbc
+ 3 | bbc
+(3 rows)
+
+SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc';
+ a |  b  
+---+-----
+ 2 | äbc
+ 3 | bbc
+(2 rows)
+
+SELECT * FROM collate_test1 WHERE b >= 'bbc' COLLATE "C";
+ a |  b  
+---+-----
+ 2 | äbc
+ 3 | bbc
+(2 rows)
+
+SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "C";
+ a |  b  
+---+-----
+ 2 | äbc
+ 3 | bbc
+(2 rows)
+
+SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "en_US%icu";
+ERROR:  collation mismatch between explicit collations "C" and "en_US%icu"
+LINE 1: ...* FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "e...
+                                                             ^
+CREATE DOMAIN testdomain_sv AS text COLLATE "sv_SE%icu";
+CREATE DOMAIN testdomain_i AS int COLLATE "sv_SE%icu"; -- fails
+ERROR:  collations are not supported by type integer
+CREATE TABLE collate_test4 (
+    a int,
+    b testdomain_sv
+);
+INSERT INTO collate_test4 SELECT * FROM collate_test1;
+SELECT a, b FROM collate_test4 ORDER BY b;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+CREATE TABLE collate_test5 (
+    a int,
+    b testdomain_sv COLLATE "en_US%icu"
+);
+INSERT INTO collate_test5 SELECT * FROM collate_test1;
+SELECT a, b FROM collate_test5 ORDER BY b;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 2 | äbc
+ 3 | bbc
+(4 rows)
+
+SELECT a, b FROM collate_test1 ORDER BY b;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 2 | äbc
+ 3 | bbc
+(4 rows)
+
+SELECT a, b FROM collate_test2 ORDER BY b;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, b FROM collate_test3 ORDER BY b;
+ a |  b  
+---+-----
+ 4 | ABC
+ 1 | abc
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, b FROM collate_test1 ORDER BY b COLLATE "C";
+ a |  b  
+---+-----
+ 4 | ABC
+ 1 | abc
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+-- star expansion
+SELECT * FROM collate_test1 ORDER BY b;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 2 | äbc
+ 3 | bbc
+(4 rows)
+
+SELECT * FROM collate_test2 ORDER BY b;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT * FROM collate_test3 ORDER BY b;
+ a |  b  
+---+-----
+ 4 | ABC
+ 1 | abc
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+-- constant expression folding
+SELECT 'bbc' COLLATE "en_US%icu" > 'äbc' COLLATE "en_US%icu" AS "true";
+ true 
+------
+ t
+(1 row)
+
+SELECT 'bbc' COLLATE "sv_SE%icu" > 'äbc' COLLATE "sv_SE%icu" AS "false";
+ false 
+-------
+ f
+(1 row)
+
+-- upper/lower
+CREATE TABLE collate_test10 (
+    a int,
+    x text COLLATE "en_US%icu",
+    y text COLLATE "tr_TR%icu"
+);
+INSERT INTO collate_test10 VALUES (1, 'hij', 'hij'), (2, 'HIJ', 'HIJ');
+SELECT a, lower(x), lower(y), upper(x), upper(y), initcap(x), initcap(y) FROM collate_test10;
+ a | lower | lower | upper | upper | initcap | initcap 
+---+-------+-------+-------+-------+---------+---------
+ 1 | hij   | hij   | HIJ   | HİJ   | Hij     | Hij
+ 2 | hij   | hıj   | HIJ   | HIJ   | Hij     | Hıj
+(2 rows)
+
+SELECT a, lower(x COLLATE "C"), lower(y COLLATE "C") FROM collate_test10;
+ a | lower | lower 
+---+-------+-------
+ 1 | hij   | hij
+ 2 | hij   | hij
+(2 rows)
+
+SELECT a, x, y FROM collate_test10 ORDER BY lower(y), a;
+ a |  x  |  y  
+---+-----+-----
+ 2 | HIJ | HIJ
+ 1 | hij | hij
+(2 rows)
+
+-- LIKE/ILIKE
+SELECT * FROM collate_test1 WHERE b LIKE 'abc';
+ a |  b  
+---+-----
+ 1 | abc
+(1 row)
+
+SELECT * FROM collate_test1 WHERE b LIKE 'abc%';
+ a |  b  
+---+-----
+ 1 | abc
+(1 row)
+
+SELECT * FROM collate_test1 WHERE b LIKE '%bc%';
+ a |  b  
+---+-----
+ 1 | abc
+ 2 | äbc
+ 3 | bbc
+(3 rows)
+
+SELECT * FROM collate_test1 WHERE b ILIKE 'abc';
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+(2 rows)
+
+SELECT * FROM collate_test1 WHERE b ILIKE 'abc%';
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+(2 rows)
+
+SELECT * FROM collate_test1 WHERE b ILIKE '%bc%';
+ a |  b  
+---+-----
+ 1 | abc
+ 2 | äbc
+ 3 | bbc
+ 4 | ABC
+(4 rows)
+
+SELECT 'Türkiye' COLLATE "en_US%icu" ILIKE '%KI%' AS "true";
+ true 
+------
+ t
+(1 row)
+
+SELECT 'Türkiye' COLLATE "tr_TR%icu" ILIKE '%KI%' AS "false";
+ false 
+-------
+ f
+(1 row)
+
+SELECT 'bıt' ILIKE 'BIT' COLLATE "en_US%icu" AS "false";
+ false 
+-------
+ f
+(1 row)
+
+SELECT 'bıt' ILIKE 'BIT' COLLATE "tr_TR%icu" AS "true";
+ true 
+------
+ t
+(1 row)
+
+-- The following actually exercises the selectivity estimation for ILIKE.
+SELECT relname FROM pg_class WHERE relname ILIKE 'abc%';
+ relname 
+---------
+(0 rows)
+
+-- regular expressions
+SELECT * FROM collate_test1 WHERE b ~ '^abc$';
+ a |  b  
+---+-----
+ 1 | abc
+(1 row)
+
+SELECT * FROM collate_test1 WHERE b ~ '^abc';
+ a |  b  
+---+-----
+ 1 | abc
+(1 row)
+
+SELECT * FROM collate_test1 WHERE b ~ 'bc';
+ a |  b  
+---+-----
+ 1 | abc
+ 2 | äbc
+ 3 | bbc
+(3 rows)
+
+SELECT * FROM collate_test1 WHERE b ~* '^abc$';
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+(2 rows)
+
+SELECT * FROM collate_test1 WHERE b ~* '^abc';
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+(2 rows)
+
+SELECT * FROM collate_test1 WHERE b ~* 'bc';
+ a |  b  
+---+-----
+ 1 | abc
+ 2 | äbc
+ 3 | bbc
+ 4 | ABC
+(4 rows)
+
+CREATE TABLE collate_test6 (
+    a int,
+    b text COLLATE "en_US%icu"
+);
+INSERT INTO collate_test6 VALUES (1, 'abc'), (2, 'ABC'), (3, '123'), (4, 'ab1'),
+                                 (5, 'a1!'), (6, 'a c'), (7, '!.;'), (8, '   '),
+                                 (9, 'äbç'), (10, 'ÄBÇ');
+SELECT b,
+       b ~ '^[[:alpha:]]+$' AS is_alpha,
+       b ~ '^[[:upper:]]+$' AS is_upper,
+       b ~ '^[[:lower:]]+$' AS is_lower,
+       b ~ '^[[:digit:]]+$' AS is_digit,
+       b ~ '^[[:alnum:]]+$' AS is_alnum,
+       b ~ '^[[:graph:]]+$' AS is_graph,
+       b ~ '^[[:print:]]+$' AS is_print,
+       b ~ '^[[:punct:]]+$' AS is_punct,
+       b ~ '^[[:space:]]+$' AS is_space
+FROM collate_test6;
+  b  | is_alpha | is_upper | is_lower | is_digit | is_alnum | is_graph | is_print | is_punct | is_space 
+-----+----------+----------+----------+----------+----------+----------+----------+----------+----------
+ abc | t        | f        | t        | f        | t        | t        | t        | f        | f
+ ABC | t        | t        | f        | f        | t        | t        | t        | f        | f
+ 123 | f        | f        | f        | t        | t        | t        | t        | f        | f
+ ab1 | f        | f        | f        | f        | t        | t        | t        | f        | f
+ a1! | f        | f        | f        | f        | f        | t        | t        | f        | f
+ a c | f        | f        | f        | f        | f        | f        | t        | f        | f
+ !.; | f        | f        | f        | f        | f        | t        | t        | t        | f
+     | f        | f        | f        | f        | f        | f        | t        | f        | t
+ äbç | t        | f        | t        | f        | t        | t        | t        | f        | f
+ ÄBÇ | t        | t        | f        | f        | t        | t        | t        | f        | f
+(10 rows)
+
+SELECT 'Türkiye' COLLATE "en_US%icu" ~* 'KI' AS "true";
+ true 
+------
+ t
+(1 row)
+
+SELECT 'Türkiye' COLLATE "tr_TR%icu" ~* 'KI' AS "true";  -- true with ICU
+ true 
+------
+ t
+(1 row)
+
+SELECT 'bıt' ~* 'BIT' COLLATE "en_US%icu" AS "false";
+ false 
+-------
+ f
+(1 row)
+
+SELECT 'bıt' ~* 'BIT' COLLATE "tr_TR%icu" AS "false";  -- false with ICU
+ false 
+-------
+ f
+(1 row)
+
+-- The following actually exercises the selectivity estimation for ~*.
+SELECT relname FROM pg_class WHERE relname ~* '^abc';
+ relname 
+---------
+(0 rows)
+
+-- to_char
+SET lc_time TO 'tr_TR';
+SELECT to_char(date '2010-04-01', 'DD TMMON YYYY');
+   to_char   
+-------------
+ 01 NIS 2010
+(1 row)
+
+SELECT to_char(date '2010-04-01', 'DD TMMON YYYY' COLLATE "tr_TR%icu");
+   to_char   
+-------------
+ 01 NİS 2010
+(1 row)
+
+-- backwards parsing
+CREATE VIEW collview1 AS SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc';
+CREATE VIEW collview2 AS SELECT a, b FROM collate_test1 ORDER BY b COLLATE "C";
+CREATE VIEW collview3 AS SELECT a, lower((x || x) COLLATE "C") FROM collate_test10;
+SELECT table_name, view_definition FROM information_schema.views
+  WHERE table_name LIKE 'collview%' ORDER BY 1;
+ table_name |                             view_definition                              
+------------+--------------------------------------------------------------------------
+ collview1  |  SELECT collate_test1.a,                                                +
+            |     collate_test1.b                                                     +
+            |    FROM collate_test1                                                   +
+            |   WHERE ((collate_test1.b COLLATE "C") >= 'bbc'::text);
+ collview2  |  SELECT collate_test1.a,                                                +
+            |     collate_test1.b                                                     +
+            |    FROM collate_test1                                                   +
+            |   ORDER BY (collate_test1.b COLLATE "C");
+ collview3  |  SELECT collate_test10.a,                                               +
+            |     lower(((collate_test10.x || collate_test10.x) COLLATE "C")) AS lower+
+            |    FROM collate_test10;
+(3 rows)
+
+-- collation propagation in various expression types
+SELECT a, coalesce(b, 'foo') FROM collate_test1 ORDER BY 2;
+ a | coalesce 
+---+----------
+ 1 | abc
+ 4 | ABC
+ 2 | äbc
+ 3 | bbc
+(4 rows)
+
+SELECT a, coalesce(b, 'foo') FROM collate_test2 ORDER BY 2;
+ a | coalesce 
+---+----------
+ 1 | abc
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, coalesce(b, 'foo') FROM collate_test3 ORDER BY 2;
+ a | coalesce 
+---+----------
+ 4 | ABC
+ 1 | abc
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, lower(coalesce(x, 'foo')), lower(coalesce(y, 'foo')) FROM collate_test10;
+ a | lower | lower 
+---+-------+-------
+ 1 | hij   | hij
+ 2 | hij   | hıj
+(2 rows)
+
+SELECT a, b, greatest(b, 'CCC') FROM collate_test1 ORDER BY 3;
+ a |  b  | greatest 
+---+-----+----------
+ 1 | abc | CCC
+ 2 | äbc | CCC
+ 3 | bbc | CCC
+ 4 | ABC | CCC
+(4 rows)
+
+SELECT a, b, greatest(b, 'CCC') FROM collate_test2 ORDER BY 3;
+ a |  b  | greatest 
+---+-----+----------
+ 1 | abc | CCC
+ 3 | bbc | CCC
+ 4 | ABC | CCC
+ 2 | äbc | äbc
+(4 rows)
+
+SELECT a, b, greatest(b, 'CCC') FROM collate_test3 ORDER BY 3;
+ a |  b  | greatest 
+---+-----+----------
+ 4 | ABC | CCC
+ 1 | abc | abc
+ 3 | bbc | bbc
+ 2 | äbc | äbc
+(4 rows)
+
+SELECT a, x, y, lower(greatest(x, 'foo')), lower(greatest(y, 'foo')) FROM collate_test10;
+ a |  x  |  y  | lower | lower 
+---+-----+-----+-------+-------
+ 1 | hij | hij | hij   | hij
+ 2 | HIJ | HIJ | hij   | hıj
+(2 rows)
+
+SELECT a, nullif(b, 'abc') FROM collate_test1 ORDER BY 2;
+ a | nullif 
+---+--------
+ 4 | ABC
+ 2 | äbc
+ 3 | bbc
+ 1 | 
+(4 rows)
+
+SELECT a, nullif(b, 'abc') FROM collate_test2 ORDER BY 2;
+ a | nullif 
+---+--------
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+ 1 | 
+(4 rows)
+
+SELECT a, nullif(b, 'abc') FROM collate_test3 ORDER BY 2;
+ a | nullif 
+---+--------
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+ 1 | 
+(4 rows)
+
+SELECT a, lower(nullif(x, 'foo')), lower(nullif(y, 'foo')) FROM collate_test10;
+ a | lower | lower 
+---+-------+-------
+ 1 | hij   | hij
+ 2 | hij   | hıj
+(2 rows)
+
+SELECT a, CASE b WHEN 'abc' THEN 'abcd' ELSE b END FROM collate_test1 ORDER BY 2;
+ a |  b   
+---+------
+ 4 | ABC
+ 2 | äbc
+ 1 | abcd
+ 3 | bbc
+(4 rows)
+
+SELECT a, CASE b WHEN 'abc' THEN 'abcd' ELSE b END FROM collate_test2 ORDER BY 2;
+ a |  b   
+---+------
+ 4 | ABC
+ 1 | abcd
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, CASE b WHEN 'abc' THEN 'abcd' ELSE b END FROM collate_test3 ORDER BY 2;
+ a |  b   
+---+------
+ 4 | ABC
+ 1 | abcd
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+CREATE DOMAIN testdomain AS text;
+SELECT a, b::testdomain FROM collate_test1 ORDER BY 2;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 2 | äbc
+ 3 | bbc
+(4 rows)
+
+SELECT a, b::testdomain FROM collate_test2 ORDER BY 2;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, b::testdomain FROM collate_test3 ORDER BY 2;
+ a |  b  
+---+-----
+ 4 | ABC
+ 1 | abc
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, b::testdomain_sv FROM collate_test3 ORDER BY 2;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, lower(x::testdomain), lower(y::testdomain) FROM collate_test10;
+ a | lower | lower 
+---+-------+-------
+ 1 | hij   | hij
+ 2 | hij   | hıj
+(2 rows)
+
+SELECT min(b), max(b) FROM collate_test1;
+ min | max 
+-----+-----
+ abc | bbc
+(1 row)
+
+SELECT min(b), max(b) FROM collate_test2;
+ min | max 
+-----+-----
+ abc | äbc
+(1 row)
+
+SELECT min(b), max(b) FROM collate_test3;
+ min | max 
+-----+-----
+ ABC | äbc
+(1 row)
+
+SELECT array_agg(b ORDER BY b) FROM collate_test1;
+     array_agg     
+-------------------
+ {abc,ABC,äbc,bbc}
+(1 row)
+
+SELECT array_agg(b ORDER BY b) FROM collate_test2;
+     array_agg     
+-------------------
+ {abc,ABC,bbc,äbc}
+(1 row)
+
+SELECT array_agg(b ORDER BY b) FROM collate_test3;
+     array_agg     
+-------------------
+ {ABC,abc,bbc,äbc}
+(1 row)
+
+SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM collate_test1 ORDER BY 2;
+ a |  b  
+---+-----
+ 1 | abc
+ 1 | abc
+ 4 | ABC
+ 4 | ABC
+ 2 | äbc
+ 2 | äbc
+ 3 | bbc
+ 3 | bbc
+(8 rows)
+
+SELECT a, b FROM collate_test2 UNION SELECT a, b FROM collate_test2 ORDER BY 2;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, b FROM collate_test3 WHERE a < 4 INTERSECT SELECT a, b FROM collate_test3 WHERE a > 1 ORDER BY 2;
+ a |  b  
+---+-----
+ 3 | bbc
+ 2 | äbc
+(2 rows)
+
+SELECT a, b FROM collate_test3 EXCEPT SELECT a, b FROM collate_test3 WHERE a < 2 ORDER BY 2;
+ a |  b  
+---+-----
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(3 rows)
+
+SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
+ERROR:  could not determine which collation to use for string comparison
+HINT:  Use the COLLATE clause to set the collation explicitly.
+SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM collate_test3; -- ok
+ a |  b  
+---+-----
+ 1 | abc
+ 2 | äbc
+ 3 | bbc
+ 4 | ABC
+ 1 | abc
+ 2 | äbc
+ 3 | bbc
+ 4 | ABC
+(8 rows)
+
+SELECT a, b FROM collate_test1 UNION SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
+ERROR:  collation mismatch between implicit collations "en_US%icu" and "C"
+LINE 1: SELECT a, b FROM collate_test1 UNION SELECT a, b FROM collat...
+                                                       ^
+HINT:  You can choose the collation by applying the COLLATE clause to one or both expressions.
+SELECT a, b COLLATE "C" FROM collate_test1 UNION SELECT a, b FROM collate_test3 ORDER BY 2; -- ok
+ a |  b  
+---+-----
+ 4 | ABC
+ 1 | abc
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, b FROM collate_test1 INTERSECT SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
+ERROR:  collation mismatch between implicit collations "en_US%icu" and "C"
+LINE 1: ...ELECT a, b FROM collate_test1 INTERSECT SELECT a, b FROM col...
+                                                             ^
+HINT:  You can choose the collation by applying the COLLATE clause to one or both expressions.
+SELECT a, b FROM collate_test1 EXCEPT SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
+ERROR:  collation mismatch between implicit collations "en_US%icu" and "C"
+LINE 1: SELECT a, b FROM collate_test1 EXCEPT SELECT a, b FROM colla...
+                                                        ^
+HINT:  You can choose the collation by applying the COLLATE clause to one or both expressions.
+CREATE TABLE test_u AS SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM collate_test3; -- fail
+ERROR:  no collation was derived for column "b" with collatable type text
+HINT:  Use the COLLATE clause to set the collation explicitly.
+-- ideally this would be a parse-time error, but for now it must be run-time:
+select x < y from collate_test10; -- fail
+ERROR:  could not determine which collation to use for string comparison
+HINT:  Use the COLLATE clause to set the collation explicitly.
+select x || y from collate_test10; -- ok, because || is not collation aware
+ ?column? 
+----------
+ hijhij
+ HIJHIJ
+(2 rows)
+
+select x, y from collate_test10 order by x || y; -- not so ok
+ERROR:  collation mismatch between implicit collations "en_US%icu" and "tr_TR%icu"
+LINE 1: select x, y from collate_test10 order by x || y;
+                                                      ^
+HINT:  You can choose the collation by applying the COLLATE clause to one or both expressions.
+-- collation mismatch between recursive and non-recursive term
+WITH RECURSIVE foo(x) AS
+   (SELECT x FROM (VALUES('a' COLLATE "en_US%icu"),('b')) t(x)
+   UNION ALL
+   SELECT (x || 'c') COLLATE "de_DE%icu" FROM foo WHERE length(x) < 10)
+SELECT * FROM foo;
+ERROR:  recursive query "foo" column 1 has collation "en_US%icu" in non-recursive term but collation "de_DE%icu" overall
+LINE 2:    (SELECT x FROM (VALUES('a' COLLATE "en_US%icu"),('b')) t(...
+                   ^
+HINT:  Use the COLLATE clause to set the collation of the non-recursive term.
+-- casting
+SELECT CAST('42' AS text COLLATE "C");
+ERROR:  syntax error at or near "COLLATE"
+LINE 1: SELECT CAST('42' AS text COLLATE "C");
+                                 ^
+SELECT a, CAST(b AS varchar) FROM collate_test1 ORDER BY 2;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 2 | äbc
+ 3 | bbc
+(4 rows)
+
+SELECT a, CAST(b AS varchar) FROM collate_test2 ORDER BY 2;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, CAST(b AS varchar) FROM collate_test3 ORDER BY 2;
+ a |  b  
+---+-----
+ 4 | ABC
+ 1 | abc
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+-- propagation of collation in SQL functions (inlined and non-inlined cases)
+-- and plpgsql functions too
+CREATE FUNCTION mylt (text, text) RETURNS boolean LANGUAGE sql
+    AS $$ select $1 < $2 $$;
+CREATE FUNCTION mylt_noninline (text, text) RETURNS boolean LANGUAGE sql
+    AS $$ select $1 < $2 limit 1 $$;
+CREATE FUNCTION mylt_plpgsql (text, text) RETURNS boolean LANGUAGE plpgsql
+    AS $$ begin return $1 < $2; end $$;
+SELECT a.b AS a, b.b AS b, a.b < b.b AS lt,
+       mylt(a.b, b.b), mylt_noninline(a.b, b.b), mylt_plpgsql(a.b, b.b)
+FROM collate_test1 a, collate_test1 b
+ORDER BY a.b, b.b;
+  a  |  b  | lt | mylt | mylt_noninline | mylt_plpgsql 
+-----+-----+----+------+----------------+--------------
+ abc | abc | f  | f    | f              | f
+ abc | ABC | t  | t    | t              | t
+ abc | äbc | t  | t    | t              | t
+ abc | bbc | t  | t    | t              | t
+ ABC | abc | f  | f    | f              | f
+ ABC | ABC | f  | f    | f              | f
+ ABC | äbc | t  | t    | t              | t
+ ABC | bbc | t  | t    | t              | t
+ äbc | abc | f  | f    | f              | f
+ äbc | ABC | f  | f    | f              | f
+ äbc | äbc | f  | f    | f              | f
+ äbc | bbc | t  | t    | t              | t
+ bbc | abc | f  | f    | f              | f
+ bbc | ABC | f  | f    | f              | f
+ bbc | äbc | f  | f    | f              | f
+ bbc | bbc | f  | f    | f              | f
+(16 rows)
+
+SELECT a.b AS a, b.b AS b, a.b < b.b COLLATE "C" AS lt,
+       mylt(a.b, b.b COLLATE "C"), mylt_noninline(a.b, b.b COLLATE "C"),
+       mylt_plpgsql(a.b, b.b COLLATE "C")
+FROM collate_test1 a, collate_test1 b
+ORDER BY a.b, b.b;
+  a  |  b  | lt | mylt | mylt_noninline | mylt_plpgsql 
+-----+-----+----+------+----------------+--------------
+ abc | abc | f  | f    | f              | f
+ abc | ABC | f  | f    | f              | f
+ abc | äbc | t  | t    | t              | t
+ abc | bbc | t  | t    | t              | t
+ ABC | abc | t  | t    | t              | t
+ ABC | ABC | f  | f    | f              | f
+ ABC | äbc | t  | t    | t              | t
+ ABC | bbc | t  | t    | t              | t
+ äbc | abc | f  | f    | f              | f
+ äbc | ABC | f  | f    | f              | f
+ äbc | äbc | f  | f    | f              | f
+ äbc | bbc | f  | f    | f              | f
+ bbc | abc | f  | f    | f              | f
+ bbc | ABC | f  | f    | f              | f
+ bbc | äbc | t  | t    | t              | t
+ bbc | bbc | f  | f    | f              | f
+(16 rows)
+
+-- collation override in plpgsql
+CREATE FUNCTION mylt2 (x text, y text) RETURNS boolean LANGUAGE plpgsql AS $$
+declare
+  xx text := x;
+  yy text := y;
+begin
+  return xx < yy;
+end
+$$;
+SELECT mylt2('a', 'B' collate "en_US%icu") as t, mylt2('a', 'B' collate "C") as f;
+ t | f 
+---+---
+ t | f
+(1 row)
+
+CREATE OR REPLACE FUNCTION
+  mylt2 (x text, y text) RETURNS boolean LANGUAGE plpgsql AS $$
+declare
+  xx text COLLATE "POSIX" := x;
+  yy text := y;
+begin
+  return xx < yy;
+end
+$$;
+SELECT mylt2('a', 'B') as f;
+ f 
+---
+ f
+(1 row)
+
+SELECT mylt2('a', 'B' collate "C") as fail; -- conflicting collations
+ERROR:  could not determine which collation to use for string comparison
+HINT:  Use the COLLATE clause to set the collation explicitly.
+CONTEXT:  PL/pgSQL function mylt2(text,text) line 6 at RETURN
+SELECT mylt2('a', 'B' collate "POSIX") as f;
+ f 
+---
+ f
+(1 row)
+
+-- polymorphism
+SELECT * FROM unnest((SELECT array_agg(b ORDER BY b) FROM collate_test1)) ORDER BY 1;
+ unnest 
+--------
+ abc
+ ABC
+ äbc
+ bbc
+(4 rows)
+
+SELECT * FROM unnest((SELECT array_agg(b ORDER BY b) FROM collate_test2)) ORDER BY 1;
+ unnest 
+--------
+ abc
+ ABC
+ bbc
+ äbc
+(4 rows)
+
+SELECT * FROM unnest((SELECT array_agg(b ORDER BY b) FROM collate_test3)) ORDER BY 1;
+ unnest 
+--------
+ ABC
+ abc
+ bbc
+ äbc
+(4 rows)
+
+CREATE FUNCTION dup (anyelement) RETURNS anyelement
+    AS 'select $1' LANGUAGE sql;
+SELECT a, dup(b) FROM collate_test1 ORDER BY 2;
+ a | dup 
+---+-----
+ 1 | abc
+ 4 | ABC
+ 2 | äbc
+ 3 | bbc
+(4 rows)
+
+SELECT a, dup(b) FROM collate_test2 ORDER BY 2;
+ a | dup 
+---+-----
+ 1 | abc
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, dup(b) FROM collate_test3 ORDER BY 2;
+ a | dup 
+---+-----
+ 4 | ABC
+ 1 | abc
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+-- indexes
+CREATE INDEX collate_test1_idx1 ON collate_test1 (b);
+CREATE INDEX collate_test1_idx2 ON collate_test1 (b COLLATE "C");
+CREATE INDEX collate_test1_idx3 ON collate_test1 ((b COLLATE "C")); -- this is different grammatically
+CREATE INDEX collate_test1_idx4 ON collate_test1 (((b||'foo') COLLATE "POSIX"));
+CREATE INDEX collate_test1_idx5 ON collate_test1 (a COLLATE "C"); -- fail
+ERROR:  collations are not supported by type integer
+CREATE INDEX collate_test1_idx6 ON collate_test1 ((a COLLATE "C")); -- fail
+ERROR:  collations are not supported by type integer
+LINE 1: ...ATE INDEX collate_test1_idx6 ON collate_test1 ((a COLLATE "C...
+                                                             ^
+SELECT relname, pg_get_indexdef(oid) FROM pg_class WHERE relname LIKE 'collate_test%_idx%' ORDER BY 1;
+      relname       |                                           pg_get_indexdef                                           
+--------------------+-----------------------------------------------------------------------------------------------------
+ collate_test1_idx1 | CREATE INDEX collate_test1_idx1 ON collate_test1 USING btree (b)
+ collate_test1_idx2 | CREATE INDEX collate_test1_idx2 ON collate_test1 USING btree (b COLLATE "C")
+ collate_test1_idx3 | CREATE INDEX collate_test1_idx3 ON collate_test1 USING btree (b COLLATE "C")
+ collate_test1_idx4 | CREATE INDEX collate_test1_idx4 ON collate_test1 USING btree (((b || 'foo'::text)) COLLATE "POSIX")
+(4 rows)
+
+-- schema manipulation commands
+CREATE ROLE regress_test_role;
+CREATE SCHEMA test_schema;
+-- We need to do this this way to cope with varying names for encodings:
+do $$
+BEGIN
+  EXECUTE 'CREATE COLLATION test0 (provider = icu, locale = ' ||
+          quote_literal(current_setting('lc_collate')) || ');';
+END
+$$;
+CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
+ERROR:  collation "test0" already exists
+do $$
+BEGIN
+  EXECUTE 'CREATE COLLATION test1 (provider = icu, lc_collate = ' ||
+          quote_literal(current_setting('lc_collate')) ||
+          ', lc_ctype = ' ||
+          quote_literal(current_setting('lc_ctype')) || ');';
+END
+$$;
+CREATE COLLATION test3 (provider = icu, lc_collate = 'en_US.utf8'); -- fail, need lc_ctype
+ERROR:  parameter "lc_ctype" must be specified
+--CREATE COLLATION testx (provider = icu, locale = 'nonsense'); -- never fails
+CREATE COLLATION test4 FROM nonsense;
+ERROR:  collation "nonsense" for encoding "UTF8" does not exist
+CREATE COLLATION test5 FROM test0;
+SELECT collname FROM pg_collation WHERE collname LIKE 'test%' ORDER BY 1;
+ collname 
+----------
+ test0
+ test1
+ test5
+(3 rows)
+
+ALTER COLLATION test1 RENAME TO test11;
+ALTER COLLATION test0 RENAME TO test11; -- fail
+ERROR:  collation "test11" already exists in schema "public"
+ALTER COLLATION test1 RENAME TO test22; -- fail
+ERROR:  collation "test1" for encoding "UTF8" does not exist
+ALTER COLLATION test11 OWNER TO regress_test_role;
+ALTER COLLATION test11 OWNER TO nonsense;
+ERROR:  role "nonsense" does not exist
+ALTER COLLATION test11 SET SCHEMA test_schema;
+COMMENT ON COLLATION test0 IS 'US English';
+SELECT collname, nspname, obj_description(pg_collation.oid, 'pg_collation')
+    FROM pg_collation JOIN pg_namespace ON (collnamespace = pg_namespace.oid)
+    WHERE collname LIKE 'test%'
+    ORDER BY 1;
+ collname |   nspname   | obj_description 
+----------+-------------+-----------------
+ test0    | public      | US English
+ test11   | test_schema | 
+ test5    | public      | 
+(3 rows)
+
+DROP COLLATION test0, test_schema.test11, test5;
+DROP COLLATION test0; -- fail
+ERROR:  collation "test0" for encoding "UTF8" does not exist
+DROP COLLATION IF EXISTS test0;
+NOTICE:  collation "test0" does not exist, skipping
+SELECT collname FROM pg_collation WHERE collname LIKE 'test%';
+ collname 
+----------
+(0 rows)
+
+DROP SCHEMA test_schema;
+DROP ROLE regress_test_role;
+-- ALTER
+ALTER COLLATION "en_US%icu" REFRESH VERSION;
+NOTICE:  version has not changed
+-- dependencies
+CREATE COLLATION test0 FROM "C";
+CREATE TABLE collate_dep_test1 (a int, b text COLLATE test0);
+CREATE DOMAIN collate_dep_dom1 AS text COLLATE test0;
+CREATE TYPE collate_dep_test2 AS (x int, y text COLLATE test0);
+CREATE VIEW collate_dep_test3 AS SELECT text 'foo' COLLATE test0 AS foo;
+CREATE TABLE collate_dep_test4t (a int, b text);
+CREATE INDEX collate_dep_test4i ON collate_dep_test4t (b COLLATE test0);
+DROP COLLATION test0 RESTRICT; -- fail
+ERROR:  cannot drop collation test0 because other objects depend on it
+DETAIL:  table collate_dep_test1 column b depends on collation test0
+type collate_dep_dom1 depends on collation test0
+composite type collate_dep_test2 column y depends on collation test0
+view collate_dep_test3 depends on collation test0
+index collate_dep_test4i depends on collation test0
+HINT:  Use DROP ... CASCADE to drop the dependent objects too.
+DROP COLLATION test0 CASCADE;
+NOTICE:  drop cascades to 5 other objects
+DETAIL:  drop cascades to table collate_dep_test1 column b
+drop cascades to type collate_dep_dom1
+drop cascades to composite type collate_dep_test2 column y
+drop cascades to view collate_dep_test3
+drop cascades to index collate_dep_test4i
+\d collate_dep_test1
+         Table "public.collate_dep_test1"
+ Column |  Type   | Collation | Nullable | Default 
+--------+---------+-----------+----------+---------
+ a      | integer |           |          | 
+
+\d collate_dep_test2
+     Composite type "public.collate_dep_test2"
+ Column |  Type   | Collation | Nullable | Default 
+--------+---------+-----------+----------+---------
+ x      | integer |           |          | 
+
+DROP TABLE collate_dep_test1, collate_dep_test4t;
+DROP TYPE collate_dep_test2;
+-- test range types and collations
+create type textrange_c as range(subtype=text, collation="C");
+create type textrange_en_us as range(subtype=text, collation="en_US%icu");
+select textrange_c('A','Z') @> 'b'::text;
+ ?column? 
+----------
+ f
+(1 row)
+
+select textrange_en_us('A','Z') @> 'b'::text;
+ ?column? 
+----------
+ t
+(1 row)
+
+drop type textrange_c;
+drop type textrange_en_us;
diff --git a/src/test/regress/sql/collate.icu.sql b/src/test/regress/sql/collate.icu.sql
new file mode 100644
index 0000000000..3dcabbb3fd
--- /dev/null
+++ b/src/test/regress/sql/collate.icu.sql
@@ -0,0 +1,420 @@
+/*
+ * This test is for ICU collations.
+ */
+
+SET client_encoding TO UTF8;
+
+
+CREATE TABLE collate_test1 (
+    a int,
+    b text COLLATE "en_US%icu" NOT NULL
+);
+
+\d collate_test1
+
+CREATE TABLE collate_test_fail (
+    a int,
+    b text COLLATE "ja_JP.eucjp%icu"
+);
+
+CREATE TABLE collate_test_fail (
+    a int,
+    b text COLLATE "foo%icu"
+);
+
+CREATE TABLE collate_test_fail (
+    a int COLLATE "en_US%icu",
+    b text
+);
+
+CREATE TABLE collate_test_like (
+    LIKE collate_test1
+);
+
+\d collate_test_like
+
+CREATE TABLE collate_test2 (
+    a int,
+    b text COLLATE "sv_SE%icu"
+);
+
+CREATE TABLE collate_test3 (
+    a int,
+    b text COLLATE "C"
+);
+
+INSERT INTO collate_test1 VALUES (1, 'abc'), (2, 'äbc'), (3, 'bbc'), (4, 'ABC');
+INSERT INTO collate_test2 SELECT * FROM collate_test1;
+INSERT INTO collate_test3 SELECT * FROM collate_test1;
+
+SELECT * FROM collate_test1 WHERE b >= 'bbc';
+SELECT * FROM collate_test2 WHERE b >= 'bbc';
+SELECT * FROM collate_test3 WHERE b >= 'bbc';
+SELECT * FROM collate_test3 WHERE b >= 'BBC';
+
+SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc';
+SELECT * FROM collate_test1 WHERE b >= 'bbc' COLLATE "C";
+SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "C";
+SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "en_US%icu";
+
+
+CREATE DOMAIN testdomain_sv AS text COLLATE "sv_SE%icu";
+CREATE DOMAIN testdomain_i AS int COLLATE "sv_SE%icu"; -- fails
+CREATE TABLE collate_test4 (
+    a int,
+    b testdomain_sv
+);
+INSERT INTO collate_test4 SELECT * FROM collate_test1;
+SELECT a, b FROM collate_test4 ORDER BY b;
+
+CREATE TABLE collate_test5 (
+    a int,
+    b testdomain_sv COLLATE "en_US%icu"
+);
+INSERT INTO collate_test5 SELECT * FROM collate_test1;
+SELECT a, b FROM collate_test5 ORDER BY b;
+
+
+SELECT a, b FROM collate_test1 ORDER BY b;
+SELECT a, b FROM collate_test2 ORDER BY b;
+SELECT a, b FROM collate_test3 ORDER BY b;
+
+SELECT a, b FROM collate_test1 ORDER BY b COLLATE "C";
+
+-- star expansion
+SELECT * FROM collate_test1 ORDER BY b;
+SELECT * FROM collate_test2 ORDER BY b;
+SELECT * FROM collate_test3 ORDER BY b;
+
+-- constant expression folding
+SELECT 'bbc' COLLATE "en_US%icu" > 'äbc' COLLATE "en_US%icu" AS "true";
+SELECT 'bbc' COLLATE "sv_SE%icu" > 'äbc' COLLATE "sv_SE%icu" AS "false";
+
+-- upper/lower
+
+CREATE TABLE collate_test10 (
+    a int,
+    x text COLLATE "en_US%icu",
+    y text COLLATE "tr_TR%icu"
+);
+
+INSERT INTO collate_test10 VALUES (1, 'hij', 'hij'), (2, 'HIJ', 'HIJ');
+
+SELECT a, lower(x), lower(y), upper(x), upper(y), initcap(x), initcap(y) FROM collate_test10;
+SELECT a, lower(x COLLATE "C"), lower(y COLLATE "C") FROM collate_test10;
+
+SELECT a, x, y FROM collate_test10 ORDER BY lower(y), a;
+
+-- LIKE/ILIKE
+
+SELECT * FROM collate_test1 WHERE b LIKE 'abc';
+SELECT * FROM collate_test1 WHERE b LIKE 'abc%';
+SELECT * FROM collate_test1 WHERE b LIKE '%bc%';
+SELECT * FROM collate_test1 WHERE b ILIKE 'abc';
+SELECT * FROM collate_test1 WHERE b ILIKE 'abc%';
+SELECT * FROM collate_test1 WHERE b ILIKE '%bc%';
+
+SELECT 'Türkiye' COLLATE "en_US%icu" ILIKE '%KI%' AS "true";
+SELECT 'Türkiye' COLLATE "tr_TR%icu" ILIKE '%KI%' AS "false";
+
+SELECT 'bıt' ILIKE 'BIT' COLLATE "en_US%icu" AS "false";
+SELECT 'bıt' ILIKE 'BIT' COLLATE "tr_TR%icu" AS "true";
+
+-- The following actually exercises the selectivity estimation for ILIKE.
+SELECT relname FROM pg_class WHERE relname ILIKE 'abc%';
+
+-- regular expressions
+
+SELECT * FROM collate_test1 WHERE b ~ '^abc$';
+SELECT * FROM collate_test1 WHERE b ~ '^abc';
+SELECT * FROM collate_test1 WHERE b ~ 'bc';
+SELECT * FROM collate_test1 WHERE b ~* '^abc$';
+SELECT * FROM collate_test1 WHERE b ~* '^abc';
+SELECT * FROM collate_test1 WHERE b ~* 'bc';
+
+CREATE TABLE collate_test6 (
+    a int,
+    b text COLLATE "en_US%icu"
+);
+INSERT INTO collate_test6 VALUES (1, 'abc'), (2, 'ABC'), (3, '123'), (4, 'ab1'),
+                                 (5, 'a1!'), (6, 'a c'), (7, '!.;'), (8, '   '),
+                                 (9, 'äbç'), (10, 'ÄBÇ');
+SELECT b,
+       b ~ '^[[:alpha:]]+$' AS is_alpha,
+       b ~ '^[[:upper:]]+$' AS is_upper,
+       b ~ '^[[:lower:]]+$' AS is_lower,
+       b ~ '^[[:digit:]]+$' AS is_digit,
+       b ~ '^[[:alnum:]]+$' AS is_alnum,
+       b ~ '^[[:graph:]]+$' AS is_graph,
+       b ~ '^[[:print:]]+$' AS is_print,
+       b ~ '^[[:punct:]]+$' AS is_punct,
+       b ~ '^[[:space:]]+$' AS is_space
+FROM collate_test6;
+
+SELECT 'Türkiye' COLLATE "en_US%icu" ~* 'KI' AS "true";
+SELECT 'Türkiye' COLLATE "tr_TR%icu" ~* 'KI' AS "true";  -- true with ICU
+
+SELECT 'bıt' ~* 'BIT' COLLATE "en_US%icu" AS "false";
+SELECT 'bıt' ~* 'BIT' COLLATE "tr_TR%icu" AS "false";  -- false with ICU
+
+-- The following actually exercises the selectivity estimation for ~*.
+SELECT relname FROM pg_class WHERE relname ~* '^abc';
+
+
+-- to_char
+
+SET lc_time TO 'tr_TR';
+SELECT to_char(date '2010-04-01', 'DD TMMON YYYY');
+SELECT to_char(date '2010-04-01', 'DD TMMON YYYY' COLLATE "tr_TR%icu");
+
+
+-- backwards parsing
+
+CREATE VIEW collview1 AS SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc';
+CREATE VIEW collview2 AS SELECT a, b FROM collate_test1 ORDER BY b COLLATE "C";
+CREATE VIEW collview3 AS SELECT a, lower((x || x) COLLATE "C") FROM collate_test10;
+
+SELECT table_name, view_definition FROM information_schema.views
+  WHERE table_name LIKE 'collview%' ORDER BY 1;
+
+
+-- collation propagation in various expression types
+
+SELECT a, coalesce(b, 'foo') FROM collate_test1 ORDER BY 2;
+SELECT a, coalesce(b, 'foo') FROM collate_test2 ORDER BY 2;
+SELECT a, coalesce(b, 'foo') FROM collate_test3 ORDER BY 2;
+SELECT a, lower(coalesce(x, 'foo')), lower(coalesce(y, 'foo')) FROM collate_test10;
+
+SELECT a, b, greatest(b, 'CCC') FROM collate_test1 ORDER BY 3;
+SELECT a, b, greatest(b, 'CCC') FROM collate_test2 ORDER BY 3;
+SELECT a, b, greatest(b, 'CCC') FROM collate_test3 ORDER BY 3;
+SELECT a, x, y, lower(greatest(x, 'foo')), lower(greatest(y, 'foo')) FROM collate_test10;
+
+SELECT a, nullif(b, 'abc') FROM collate_test1 ORDER BY 2;
+SELECT a, nullif(b, 'abc') FROM collate_test2 ORDER BY 2;
+SELECT a, nullif(b, 'abc') FROM collate_test3 ORDER BY 2;
+SELECT a, lower(nullif(x, 'foo')), lower(nullif(y, 'foo')) FROM collate_test10;
+
+SELECT a, CASE b WHEN 'abc' THEN 'abcd' ELSE b END FROM collate_test1 ORDER BY 2;
+SELECT a, CASE b WHEN 'abc' THEN 'abcd' ELSE b END FROM collate_test2 ORDER BY 2;
+SELECT a, CASE b WHEN 'abc' THEN 'abcd' ELSE b END FROM collate_test3 ORDER BY 2;
+
+CREATE DOMAIN testdomain AS text;
+SELECT a, b::testdomain FROM collate_test1 ORDER BY 2;
+SELECT a, b::testdomain FROM collate_test2 ORDER BY 2;
+SELECT a, b::testdomain FROM collate_test3 ORDER BY 2;
+SELECT a, b::testdomain_sv FROM collate_test3 ORDER BY 2;
+SELECT a, lower(x::testdomain), lower(y::testdomain) FROM collate_test10;
+
+SELECT min(b), max(b) FROM collate_test1;
+SELECT min(b), max(b) FROM collate_test2;
+SELECT min(b), max(b) FROM collate_test3;
+
+SELECT array_agg(b ORDER BY b) FROM collate_test1;
+SELECT array_agg(b ORDER BY b) FROM collate_test2;
+SELECT array_agg(b ORDER BY b) FROM collate_test3;
+
+SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM collate_test1 ORDER BY 2;
+SELECT a, b FROM collate_test2 UNION SELECT a, b FROM collate_test2 ORDER BY 2;
+SELECT a, b FROM collate_test3 WHERE a < 4 INTERSECT SELECT a, b FROM collate_test3 WHERE a > 1 ORDER BY 2;
+SELECT a, b FROM collate_test3 EXCEPT SELECT a, b FROM collate_test3 WHERE a < 2 ORDER BY 2;
+
+SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
+SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM collate_test3; -- ok
+SELECT a, b FROM collate_test1 UNION SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
+SELECT a, b COLLATE "C" FROM collate_test1 UNION SELECT a, b FROM collate_test3 ORDER BY 2; -- ok
+SELECT a, b FROM collate_test1 INTERSECT SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
+SELECT a, b FROM collate_test1 EXCEPT SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
+
+CREATE TABLE test_u AS SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM collate_test3; -- fail
+
+-- ideally this would be a parse-time error, but for now it must be run-time:
+select x < y from collate_test10; -- fail
+select x || y from collate_test10; -- ok, because || is not collation aware
+select x, y from collate_test10 order by x || y; -- not so ok
+
+-- collation mismatch between recursive and non-recursive term
+WITH RECURSIVE foo(x) AS
+   (SELECT x FROM (VALUES('a' COLLATE "en_US%icu"),('b')) t(x)
+   UNION ALL
+   SELECT (x || 'c') COLLATE "de_DE%icu" FROM foo WHERE length(x) < 10)
+SELECT * FROM foo;
+
+
+-- casting
+
+SELECT CAST('42' AS text COLLATE "C");
+
+SELECT a, CAST(b AS varchar) FROM collate_test1 ORDER BY 2;
+SELECT a, CAST(b AS varchar) FROM collate_test2 ORDER BY 2;
+SELECT a, CAST(b AS varchar) FROM collate_test3 ORDER BY 2;
+
+
+-- propagation of collation in SQL functions (inlined and non-inlined cases)
+-- and plpgsql functions too
+
+CREATE FUNCTION mylt (text, text) RETURNS boolean LANGUAGE sql
+    AS $$ select $1 < $2 $$;
+
+CREATE FUNCTION mylt_noninline (text, text) RETURNS boolean LANGUAGE sql
+    AS $$ select $1 < $2 limit 1 $$;
+
+CREATE FUNCTION mylt_plpgsql (text, text) RETURNS boolean LANGUAGE plpgsql
+    AS $$ begin return $1 < $2; end $$;
+
+SELECT a.b AS a, b.b AS b, a.b < b.b AS lt,
+       mylt(a.b, b.b), mylt_noninline(a.b, b.b), mylt_plpgsql(a.b, b.b)
+FROM collate_test1 a, collate_test1 b
+ORDER BY a.b, b.b;
+
+SELECT a.b AS a, b.b AS b, a.b < b.b COLLATE "C" AS lt,
+       mylt(a.b, b.b COLLATE "C"), mylt_noninline(a.b, b.b COLLATE "C"),
+       mylt_plpgsql(a.b, b.b COLLATE "C")
+FROM collate_test1 a, collate_test1 b
+ORDER BY a.b, b.b;
+
+
+-- collation override in plpgsql
+
+CREATE FUNCTION mylt2 (x text, y text) RETURNS boolean LANGUAGE plpgsql AS $$
+declare
+  xx text := x;
+  yy text := y;
+begin
+  return xx < yy;
+end
+$$;
+
+SELECT mylt2('a', 'B' collate "en_US%icu") as t, mylt2('a', 'B' collate "C") as f;
+
+CREATE OR REPLACE FUNCTION
+  mylt2 (x text, y text) RETURNS boolean LANGUAGE plpgsql AS $$
+declare
+  xx text COLLATE "POSIX" := x;
+  yy text := y;
+begin
+  return xx < yy;
+end
+$$;
+
+SELECT mylt2('a', 'B') as f;
+SELECT mylt2('a', 'B' collate "C") as fail; -- conflicting collations
+SELECT mylt2('a', 'B' collate "POSIX") as f;
+
+
+-- polymorphism
+
+SELECT * FROM unnest((SELECT array_agg(b ORDER BY b) FROM collate_test1)) ORDER BY 1;
+SELECT * FROM unnest((SELECT array_agg(b ORDER BY b) FROM collate_test2)) ORDER BY 1;
+SELECT * FROM unnest((SELECT array_agg(b ORDER BY b) FROM collate_test3)) ORDER BY 1;
+
+CREATE FUNCTION dup (anyelement) RETURNS anyelement
+    AS 'select $1' LANGUAGE sql;
+
+SELECT a, dup(b) FROM collate_test1 ORDER BY 2;
+SELECT a, dup(b) FROM collate_test2 ORDER BY 2;
+SELECT a, dup(b) FROM collate_test3 ORDER BY 2;
+
+
+-- indexes
+
+CREATE INDEX collate_test1_idx1 ON collate_test1 (b);
+CREATE INDEX collate_test1_idx2 ON collate_test1 (b COLLATE "C");
+CREATE INDEX collate_test1_idx3 ON collate_test1 ((b COLLATE "C")); -- this is different grammatically
+CREATE INDEX collate_test1_idx4 ON collate_test1 (((b||'foo') COLLATE "POSIX"));
+
+CREATE INDEX collate_test1_idx5 ON collate_test1 (a COLLATE "C"); -- fail
+CREATE INDEX collate_test1_idx6 ON collate_test1 ((a COLLATE "C")); -- fail
+
+SELECT relname, pg_get_indexdef(oid) FROM pg_class WHERE relname LIKE 'collate_test%_idx%' ORDER BY 1;
+
+
+-- schema manipulation commands
+
+CREATE ROLE regress_test_role;
+CREATE SCHEMA test_schema;
+
+-- We need to do this this way to cope with varying names for encodings:
+do $$
+BEGIN
+  EXECUTE 'CREATE COLLATION test0 (provider = icu, locale = ' ||
+          quote_literal(current_setting('lc_collate')) || ');';
+END
+$$;
+CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
+do $$
+BEGIN
+  EXECUTE 'CREATE COLLATION test1 (provider = icu, lc_collate = ' ||
+          quote_literal(current_setting('lc_collate')) ||
+          ', lc_ctype = ' ||
+          quote_literal(current_setting('lc_ctype')) || ');';
+END
+$$;
+CREATE COLLATION test3 (provider = icu, lc_collate = 'en_US.utf8'); -- fail, need lc_ctype
+--CREATE COLLATION testx (provider = icu, locale = 'nonsense'); -- never fails
+
+CREATE COLLATION test4 FROM nonsense;
+CREATE COLLATION test5 FROM test0;
+
+SELECT collname FROM pg_collation WHERE collname LIKE 'test%' ORDER BY 1;
+
+ALTER COLLATION test1 RENAME TO test11;
+ALTER COLLATION test0 RENAME TO test11; -- fail
+ALTER COLLATION test1 RENAME TO test22; -- fail
+
+ALTER COLLATION test11 OWNER TO regress_test_role;
+ALTER COLLATION test11 OWNER TO nonsense;
+ALTER COLLATION test11 SET SCHEMA test_schema;
+
+COMMENT ON COLLATION test0 IS 'US English';
+
+SELECT collname, nspname, obj_description(pg_collation.oid, 'pg_collation')
+    FROM pg_collation JOIN pg_namespace ON (collnamespace = pg_namespace.oid)
+    WHERE collname LIKE 'test%'
+    ORDER BY 1;
+
+DROP COLLATION test0, test_schema.test11, test5;
+DROP COLLATION test0; -- fail
+DROP COLLATION IF EXISTS test0;
+
+SELECT collname FROM pg_collation WHERE collname LIKE 'test%';
+
+DROP SCHEMA test_schema;
+DROP ROLE regress_test_role;
+
+
+-- ALTER
+
+ALTER COLLATION "en_US%icu" REFRESH VERSION;
+
+
+-- dependencies
+
+CREATE COLLATION test0 FROM "C";
+
+CREATE TABLE collate_dep_test1 (a int, b text COLLATE test0);
+CREATE DOMAIN collate_dep_dom1 AS text COLLATE test0;
+CREATE TYPE collate_dep_test2 AS (x int, y text COLLATE test0);
+CREATE VIEW collate_dep_test3 AS SELECT text 'foo' COLLATE test0 AS foo;
+CREATE TABLE collate_dep_test4t (a int, b text);
+CREATE INDEX collate_dep_test4i ON collate_dep_test4t (b COLLATE test0);
+
+DROP COLLATION test0 RESTRICT; -- fail
+DROP COLLATION test0 CASCADE;
+
+\d collate_dep_test1
+\d collate_dep_test2
+
+DROP TABLE collate_dep_test1, collate_dep_test4t;
+DROP TYPE collate_dep_test2;
+
+-- test range types and collations
+
+create type textrange_c as range(subtype=text, collation="C");
+create type textrange_en_us as range(subtype=text, collation="en_US%icu");
+
+select textrange_c('A','Z') @> 'b'::text;
+select textrange_en_us('A','Z') @> 'b'::text;
+
+drop type textrange_c;
+drop type textrange_en_us;
-- 
2.11.0

#46Peter Geoghegan
pg@heroku.com
In reply to: Peter Eisentraut (#45)
Re: ICU integration

On Tue, Dec 27, 2016 at 6:50 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

I don't have much experience with the abbreviated key stuff. I have
filled in what I think should work, but it needs detailed review.

Thanks.

It occurs to me that the comparison caching stuff added by commit
0e57b4d8b needs to be considered here, too. When we had to copy the
string to a temp buffer anyway, in order to add the terminating NUL
byte expected by strcoll(), there was an opportunity to do caching of
comparisons at little additional cost. However, since ICU offers an
interface that you're using that doesn't require any NUL byte, there
is a new trade-off to be considered -- swallow the cost of copying
into our own temp buffer solely for the benefit of comparison caching,
or don't do comparison caching. (Note that glibc had a similar
comparison caching optimization itself at one point, built right into
strcoll(), but it was subsequently disabled.)
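
To make the trade-off concrete, here is a rough sketch of the two shapes
(illustration only, not code from the patch; the function names
cmp_with_libc and cmp_with_icu are made up, and error handling is
omitted). The libc path has to copy both strings into NUL-terminated
scratch buffers for strcoll_l(), and those copies are also where a
comparison cache can be consulted almost for free; the ICU path can call
ucol_strcollUTF8() on the original bytes with explicit lengths, so there
is no copy, and also no obvious cheap place to hook caching in:

/* assumes <string.h>, <locale.h> (strcoll_l), <unicode/ucol.h>, palloc/pfree */

static int
cmp_with_libc(const char *a, int alen, const char *b, int blen, locale_t loc)
{
    char   *a0 = palloc(alen + 1);
    char   *b0 = palloc(blen + 1);
    int     result;

    memcpy(a0, a, alen);
    a0[alen] = '\0';
    memcpy(b0, b, blen);
    b0[blen] = '\0';
    /* the copies above are where cached comparison results can be looked up */
    result = strcoll_l(a0, b0, loc);
    pfree(a0);
    pfree(b0);
    return result;
}

static int
cmp_with_icu(const char *a, int alen, const char *b, int blen, UCollator *coll)
{
    UErrorCode       status = U_ZERO_ERROR;
    UCollationResult r;

    /* no NUL termination and no temp buffer needed */
    r = ucol_strcollUTF8(coll, a, alen, b, blen, &status);
    return (r == UCOL_EQUAL) ? 0 : (r == UCOL_LESS ? -1 : 1);
}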

--
Peter Geoghegan

#47Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Peter Geoghegan (#46)
Re: ICU integration

On 1/7/17 10:01 PM, Peter Geoghegan wrote:

It occurs to me that the comparison caching stuff added by commit
0e57b4d8b needs to be considered here, too. When we had to copy the
string to a temp buffer anyway, in order to add the terminating NUL
byte expected by strcoll(), there was an opportunity to do caching of
comparisons at little additional cost. However, since ICU offers an
interface that you're using that doesn't require any NUL byte, there
is a new trade-off to be considered -- swallow the cost of copying
into our own temp buffer solely for the benefit of comparison caching,
or don't do comparison caching. (Note that glibc had a similar
comparison caching optimization itself at one point, built right into
strcoll(), but it was subsequently disabled.)

That might be worth looking into, but it seems a bit daunting to
construct a benchmark specifically for this, unless we have the one that
was originally used lying around somewhere.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#48Peter Geoghegan
pg@heroku.com
In reply to: Peter Eisentraut (#47)
Re: ICU integration

On Mon, Jan 9, 2017 at 12:25 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 1/7/17 10:01 PM, Peter Geoghegan wrote:

It occurs to me that the comparison caching stuff added by commit
0e57b4d8b needs to be considered here, too.

That might be worth looking into, but it seems a bit daunting to
construct a benchmark specifically for this, unless we have the one that
was originally used lying around somewhere.

The benchmark used when 0e57b4d8b went in only had to prove that there
was no measurable overhead when the optimization didn't help (it was
quite clear that it was worthwhile in good cases). I think comparison
caching would continue to be about as effective as before in the good
cases, but the ICU path simply doesn't do comparison caching anymore.
That might be fine, but let's be sure that that's the right trade-off.

To demonstrate the effectiveness of the patch, I used this cities sample data:

http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/data/cities.dump

Test query: select country, province, count(*) from cities group by
rollup (country, province);

This was shown to be about 25% faster, although that was with
abbreviated keys (plus caching of abbreviated keys), and not just the
comparison caching optimization.

--
Peter Geoghegan

#49Peter Geoghegan
pg@heroku.com
In reply to: Peter Eisentraut (#45)
Re: ICU integration

On Tue, Dec 27, 2016 at 6:50 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

Updated patch attached.

Some more things I noticed following another quick read over the patch:

* I think it's worth looking into ucol_nextSortKeyPart(), and using
that as an alternative to ucol_getSortKey(). It doesn't seem any
harder, and when I tested it it was clearly faster. (I think that
ucol_nextSortKeyPart() is more or less intended to be used for
abbreviated keys.)

* I think that it's not okay that convert_string_datum() still uses
strxfrm() without considering if it's an ICU build. That's why I
raised the idea of a pg_strxfrm() wrapper at one point.

* Similarly, I think that check_strxfrm_bug() should have something
about ICU. It's looking for a particular bug in some very old version
of Solaris 8. At a minimum, check_strxfrm_bug() should now not run at
all (a broken OS strxfrm() shouldn't be a problem with ICU).
Otherwise, it could do some kind of testing on our pg_strxfrm()
wrapper (or similar).
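
On the ucol_nextSortKeyPart() point, the call sequence I have in mind is
roughly the following (just a sketch; fill_abbrev_key and its arguments
are made-up names, and error handling is omitted). It takes a
UCharIterator over the UTF-8 input directly, so no intermediate UChar
conversion is needed the way ucol_getSortKey() requires:

/* assumes <unicode/ucol.h> and <unicode/uiter.h> */
static void
fill_abbrev_key(UCollator *coll, const char *str, int len,
                uint8_t *dest, int destsize)
{
    UCharIterator iter;
    uint32_t      state[2] = {0, 0};   /* must be zeroed before the first call */
    UErrorCode    status = U_ZERO_ERROR;

    uiter_setUTF8(&iter, str, len);
    /* writes up to destsize leading bytes of the sort key into dest */
    ucol_nextSortKeyPart(coll, &iter, state, dest, destsize, &status);
    /* a real caller would check U_FAILURE(status) here */
}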
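
On the pg_strxfrm() point, what I had in mind is only a thin dispatch
layer, something like the sketch below. To be clear, this is not code
from the patch: COLLPROVIDER_ICU and the locale->provider and
locale->info fields are guesses at the struct layout, and
ucol_nextSortKeyPart()'s return value (bytes actually written) does not
have exactly the same semantics as strxfrm()'s, which reports the full
length needed even when the output is truncated:

/* hypothetical wrapper; field and constant names are assumptions */
static size_t
pg_strxfrm(char *dest, const char *src, size_t destsize, pg_locale_t locale)
{
    if (locale && locale->provider == COLLPROVIDER_ICU)
    {
        UCharIterator iter;
        uint32_t      state[2] = {0, 0};
        UErrorCode    status = U_ZERO_ERROR;

        uiter_setUTF8(&iter, src, strlen(src));
        return ucol_nextSortKeyPart(locale->info.icu.ucol, &iter, state,
                                    (uint8_t *) dest, destsize, &status);
    }

    /* libc path, what convert_string_datum() does today */
    return strxfrm(dest, src, destsize);
}

That would leave convert_string_datum() and check_strxfrm_bug() dealing
with a single function, with the ICU-specific details hidden behind it.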

--
Peter Geoghegan

#50Pavel Stehule
pavel.stehule@gmail.com
In reply to: Peter Eisentraut (#45)
Re: ICU integration

Hi

2016-12-28 3:50 GMT+01:00 Peter Eisentraut <peter.eisentraut@2ndquadrant.com>:

Updated patch attached.

The previous round of reviews showed that there was general agreement on
the approach. So I have focused on filling in the gaps, added ICU
support to all the locale-using places, added documentation, fixed some
minor issues that have been pointed out. Everything appears to work
correctly now, and the user-facing feature set is pretty well-rounded.

I don't have much experience with the abbreviated key stuff. I have
filled in what I think should work, but it needs detailed review.

Similarly, some of the stuff in the regular expression code was hacked
in blindly.

One minor problem is that we now support adding any-encoding collation
entries, which violates some of the comments in CollationCreate(). See
FIXME there. It doesn't seem worth major contortions to fix that; maybe
it just has to be documented better.

the regress test fails

Program received signal SIGSEGV, Segmentation fault.
0x00000000007bbc2b in pattern_char_isalpha (locale_is_c=0 '\000',
locale=0x1a73220, is_multibyte=1 '\001', c=97 'a') at selfuncs.c:5291
5291 return isalpha_l((unsigned char) c, locale->lt);

(gdb) bt
#0 0x00000000007bbc2b in pattern_char_isalpha (locale_is_c=0 '\000',
locale=0x1a73220, is_multibyte=1 '\001', c=97 'a') at selfuncs.c:5291
#1 like_fixed_prefix (patt_const=<optimized out>,
case_insensitive=<optimized out>, collation=<optimized out>,
prefix_const=0x7ffc0963e800,
rest_selec=0x7ffc0963e808) at selfuncs.c:5389
#2 0x00000000007c1076 in patternsel (fcinfo=<optimized out>,
ptype=ptype@entry=Pattern_Type_Like_IC, negate=negate@entry=0 '\000')
at selfuncs.c:1228
#3 0x00000000007c1680 in iclikesel (fcinfo=<optimized out>) at
selfuncs.c:1406
#4 0x000000000080db56 in OidFunctionCall4Coll (functionId=<optimized out>,
collation=collation@entry=12886, arg1=arg1@entry=28299032,
arg2=arg2@entry=1627, arg3=arg3@entry=28300096, arg4=arg4@entry=0) at
fmgr.c:1674
#5 0x000000000068e424 in restriction_selectivity (root=root@entry=0x1afcf18,
operatorid=1627, args=0x1afd340, inputcollid=12886,
varRelid=varRelid@entry=0) at plancat.c:1583
#6 0x000000000065457e in clause_selectivity (root=0x1afcf18,
clause=<optimized out>, varRelid=0, jointype=JOIN_INNER, sjinfo=0x0)
at clausesel.c:657
#7 0x000000000065485c in clauselist_selectivity (root=root@entry=0x1afcf18,
clauses=<optimized out>, varRelid=varRelid@entry=0,
jointype=jointype@entry=JOIN_INNER, sjinfo=sjinfo@entry=0x0) at
clausesel.c:107
#8 0x00000000006599d4 in set_baserel_size_estimates
(root=root@entry=0x1afcf18,
rel=rel@entry=0x1afd500) at costsize.c:3771
#9 0x00000000006526e5 in set_plain_rel_size (rte=<optimized out>,
rel=<optimized out>, root=<optimized out>) at allpaths.c:492
#10 set_rel_size (root=root@entry=0x1afcf18, rel=rel@entry=0x1afd500,
rti=rti@entry=1, rte=0x1acfb68) at allpaths.c:352
#11 0x0000000000653ebd in set_base_rel_sizes (root=<optimized out>) at
allpaths.c:272
#12 make_one_rel (root=root@entry=0x1afcf18, joinlist=joinlist@entry=0x1afd810)
at allpaths.c:170
#13 0x0000000000670ad4 in query_planner (root=root@entry=0x1afcf18,
tlist=tlist@entry=0x1afd1b0,
qp_callback=qp_callback@entry=0x6710c0 <standard_qp_callback>,
qp_extra=qp_extra@entry=0x7ffc0963f020) at planmain.c:254
#14 0x00000000006727cc in grouping_planner (root=root@entry=0x1afcf18,
inheritance_update=inheritance_update@entry=0 '\000',
tuple_fraction=<optimized out>, tuple_fraction@entry=0) at
planner.c:1729
#15 0x00000000006752c6 in subquery_planner (glob=glob@entry=0x1afcbe8,
parse=parse@entry=0x1acfa50, parent_root=parent_root@entry=0x0,
hasRecursion=hasRecursion@entry=0 '\000',
tuple_fraction=tuple_fraction@entry=0) at planner.c:789
#16 0x000000000067619f in standard_planner (parse=0x1acfa50,
cursorOptions=256, boundParams=0x0) at planner.c:301
#17 0x00000000007095bd in pg_plan_query (querytree=0x1acfa50,
cursorOptions=256, boundParams=0x0) at postgres.c:798
#18 0x000000000070968e in pg_plan_queries (querytrees=<optimized out>,
cursorOptions=cursorOptions@entry=256, boundParams=boundParams@entry=0x0)
at postgres.c:864
#19 0x000000000070b1e7 in exec_simple_query (query_string=0x1ace8c8 "SELECT
* FROM collate_test1 WHERE b ILIKE 'abc';") at postgres.c:1029
#20 PostgresMain (argc=<optimized out>, argv=argv@entry=0x1a78988,
dbname=<optimized out>, username=<optimized out>) at postgres.c:4067
#21 0x000000000046fc7d in BackendRun (port=0x1a6e6e0) at postmaster.c:4300
#22 BackendStartup (port=0x1a6e6e0) at postmaster.c:3972

[root@localhost backend]# dnf info libicu
Last metadata expiration check: 1:50:20 ago on Sun Jan 15 09:58:29 2017.
Installed Packages
Name : libicu
Arch : x86_64
Epoch : 0
Version : 57.1
Release : 4.fc25
Size : 29 M
Repo : @System

Regards

Pavel

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#51Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Pavel Stehule (#50)
1 attachment(s)
Re: ICU integration

On 1/15/17 5:53 AM, Pavel Stehule wrote:

the regress test fails

Program received signal SIGSEGV, Segmentation fault.
0x00000000007bbc2b in pattern_char_isalpha (locale_is_c=0 '\000',
locale=0x1a73220, is_multibyte=1 '\001', c=97 'a') at selfuncs.c:5291
5291 return isalpha_l((unsigned char) c, locale->lt);

Here is an updated patch that fixes this crash and is rebased on top of
recent changes.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v3-0001-ICU-support.patch (text/x-patch)
From 286e19a1ff2888b6134ba3f55c56365416f0d1a9 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter_e@gmx.net>
Date: Tue, 24 Jan 2017 12:34:49 -0500
Subject: [PATCH v3] ICU support

Add a column collprovider to pg_collation that determines which library
provides the collation data.  The existing choices are default and libc,
and this adds an icu choice, which uses the ICU4C library.

The pg_locale_t type is changed to a struct that contains the
provider-specific locale handles.  Users of locale information are
changed to look into that struct for the appropriate handle to use.

Also add a collversion column that records the version of the collation
when it is created, and check at run time whether it is still the same.
This detects potentially incompatible library upgrades that can corrupt
indexes and other structures.  This is currently only supported by
ICU-provided collations.

initdb initializes the default collation set as before from the `locale
-a` output but also adds all available ICU locales with a "%icu"
appended.

Currently, ICU-provided collations can only be explicitly named
collations.  The global database locales are still always libc-provided.
---
 aclocal.m4                                |    1 +
 config/pkg.m4                             |  275 +++++++
 configure                                 |  313 ++++++++
 configure.in                              |   35 +
 doc/src/sgml/catalogs.sgml                |   19 +
 doc/src/sgml/charset.sgml                 |   43 +-
 doc/src/sgml/installation.sgml            |   14 +
 doc/src/sgml/ref/alter_collation.sgml     |   42 ++
 doc/src/sgml/ref/create_collation.sgml    |   16 +-
 src/Makefile.global.in                    |    4 +
 src/backend/Makefile                      |    2 +-
 src/backend/catalog/pg_collation.c        |   20 +-
 src/backend/commands/collationcmds.c      |  166 ++++-
 src/backend/common.mk                     |    2 +
 src/backend/nodes/copyfuncs.c             |   13 +
 src/backend/nodes/equalfuncs.c            |   11 +
 src/backend/parser/gram.y                 |   18 +-
 src/backend/regex/regc_pg_locale.c        |  110 ++-
 src/backend/tcop/utility.c                |    8 +
 src/backend/utils/adt/formatting.c        |  453 ++++++------
 src/backend/utils/adt/like.c              |   53 +-
 src/backend/utils/adt/pg_locale.c         |  160 ++++-
 src/backend/utils/adt/selfuncs.c          |    8 +-
 src/backend/utils/adt/varlena.c           |  108 ++-
 src/bin/initdb/initdb.c                   |    3 +-
 src/bin/pg_dump/pg_dump.c                 |   55 +-
 src/include/catalog/pg_collation.h        |   23 +-
 src/include/catalog/pg_collation_fn.h     |    2 +
 src/include/commands/collationcmds.h      |    1 +
 src/include/nodes/nodes.h                 |    1 +
 src/include/nodes/parsenodes.h            |   11 +
 src/include/pg_config.h.in                |    6 +
 src/include/utils/pg_locale.h             |   27 +-
 src/test/regress/GNUmakefile              |    3 +
 src/test/regress/expected/collate.icu.out | 1107 +++++++++++++++++++++++++++++
 src/test/regress/sql/collate.icu.sql      |  420 +++++++++++
 36 files changed, 3197 insertions(+), 356 deletions(-)
 create mode 100644 config/pkg.m4
 create mode 100644 src/test/regress/expected/collate.icu.out
 create mode 100644 src/test/regress/sql/collate.icu.sql

diff --git a/aclocal.m4 b/aclocal.m4
index 6f930b6fc1..5ca902b6a2 100644
--- a/aclocal.m4
+++ b/aclocal.m4
@@ -7,6 +7,7 @@ m4_include([config/docbook.m4])
 m4_include([config/general.m4])
 m4_include([config/libtool.m4])
 m4_include([config/perl.m4])
+m4_include([config/pkg.m4])
 m4_include([config/programs.m4])
 m4_include([config/python.m4])
 m4_include([config/tcl.m4])
diff --git a/config/pkg.m4 b/config/pkg.m4
new file mode 100644
index 0000000000..82bea96ee7
--- /dev/null
+++ b/config/pkg.m4
@@ -0,0 +1,275 @@
+dnl pkg.m4 - Macros to locate and utilise pkg-config.   -*- Autoconf -*-
+dnl serial 11 (pkg-config-0.29.1)
+dnl
+dnl Copyright © 2004 Scott James Remnant <scott@netsplit.com>.
+dnl Copyright © 2012-2015 Dan Nicholson <dbn.lists@gmail.com>
+dnl
+dnl This program is free software; you can redistribute it and/or modify
+dnl it under the terms of the GNU General Public License as published by
+dnl the Free Software Foundation; either version 2 of the License, or
+dnl (at your option) any later version.
+dnl
+dnl This program is distributed in the hope that it will be useful, but
+dnl WITHOUT ANY WARRANTY; without even the implied warranty of
+dnl MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+dnl General Public License for more details.
+dnl
+dnl You should have received a copy of the GNU General Public License
+dnl along with this program; if not, write to the Free Software
+dnl Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA
+dnl 02111-1307, USA.
+dnl
+dnl As a special exception to the GNU General Public License, if you
+dnl distribute this file as part of a program that contains a
+dnl configuration script generated by Autoconf, you may include it under
+dnl the same distribution terms that you use for the rest of that
+dnl program.
+
+dnl PKG_PREREQ(MIN-VERSION)
+dnl -----------------------
+dnl Since: 0.29
+dnl
+dnl Verify that the version of the pkg-config macros are at least
+dnl MIN-VERSION. Unlike PKG_PROG_PKG_CONFIG, which checks the user's
+dnl installed version of pkg-config, this checks the developer's version
+dnl of pkg.m4 when generating configure.
+dnl
+dnl To ensure that this macro is defined, also add:
+dnl m4_ifndef([PKG_PREREQ],
+dnl     [m4_fatal([must install pkg-config 0.29 or later before running autoconf/autogen])])
+dnl
+dnl See the "Since" comment for each macro you use to see what version
+dnl of the macros you require.
+m4_defun([PKG_PREREQ],
+[m4_define([PKG_MACROS_VERSION], [0.29.1])
+m4_if(m4_version_compare(PKG_MACROS_VERSION, [$1]), -1,
+    [m4_fatal([pkg.m4 version $1 or higher is required but ]PKG_MACROS_VERSION[ found])])
+])dnl PKG_PREREQ
+
+dnl PKG_PROG_PKG_CONFIG([MIN-VERSION])
+dnl ----------------------------------
+dnl Since: 0.16
+dnl
+dnl Search for the pkg-config tool and set the PKG_CONFIG variable to
+dnl first found in the path. Checks that the version of pkg-config found
+dnl is at least MIN-VERSION. If MIN-VERSION is not specified, 0.9.0 is
+dnl used since that's the first version where most current features of
+dnl pkg-config existed.
+AC_DEFUN([PKG_PROG_PKG_CONFIG],
+[m4_pattern_forbid([^_?PKG_[A-Z_]+$])
+m4_pattern_allow([^PKG_CONFIG(_(PATH|LIBDIR|SYSROOT_DIR|ALLOW_SYSTEM_(CFLAGS|LIBS)))?$])
+m4_pattern_allow([^PKG_CONFIG_(DISABLE_UNINSTALLED|TOP_BUILD_DIR|DEBUG_SPEW)$])
+AC_ARG_VAR([PKG_CONFIG], [path to pkg-config utility])
+AC_ARG_VAR([PKG_CONFIG_PATH], [directories to add to pkg-config's search path])
+AC_ARG_VAR([PKG_CONFIG_LIBDIR], [path overriding pkg-config's built-in search path])
+
+if test "x$ac_cv_env_PKG_CONFIG_set" != "xset"; then
+	AC_PATH_TOOL([PKG_CONFIG], [pkg-config])
+fi
+if test -n "$PKG_CONFIG"; then
+	_pkg_min_version=m4_default([$1], [0.9.0])
+	AC_MSG_CHECKING([pkg-config is at least version $_pkg_min_version])
+	if $PKG_CONFIG --atleast-pkgconfig-version $_pkg_min_version; then
+		AC_MSG_RESULT([yes])
+	else
+		AC_MSG_RESULT([no])
+		PKG_CONFIG=""
+	fi
+fi[]dnl
+])dnl PKG_PROG_PKG_CONFIG
+
+dnl PKG_CHECK_EXISTS(MODULES, [ACTION-IF-FOUND], [ACTION-IF-NOT-FOUND])
+dnl -------------------------------------------------------------------
+dnl Since: 0.18
+dnl
+dnl Check to see whether a particular set of modules exists. Similar to
+dnl PKG_CHECK_MODULES(), but does not set variables or print errors.
+dnl
+dnl Please remember that m4 expands AC_REQUIRE([PKG_PROG_PKG_CONFIG])
+dnl only at the first occurence in configure.ac, so if the first place
+dnl it's called might be skipped (such as if it is within an "if", you
+dnl have to call PKG_CHECK_EXISTS manually
+AC_DEFUN([PKG_CHECK_EXISTS],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])dnl
+if test -n "$PKG_CONFIG" && \
+    AC_RUN_LOG([$PKG_CONFIG --exists --print-errors "$1"]); then
+  m4_default([$2], [:])
+m4_ifvaln([$3], [else
+  $3])dnl
+fi])
+
+dnl _PKG_CONFIG([VARIABLE], [COMMAND], [MODULES])
+dnl ---------------------------------------------
+dnl Internal wrapper calling pkg-config via PKG_CONFIG and setting
+dnl pkg_failed based on the result.
+m4_define([_PKG_CONFIG],
+[if test -n "$$1"; then
+    pkg_cv_[]$1="$$1"
+ elif test -n "$PKG_CONFIG"; then
+    PKG_CHECK_EXISTS([$3],
+                     [pkg_cv_[]$1=`$PKG_CONFIG --[]$2 "$3" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes ],
+		     [pkg_failed=yes])
+ else
+    pkg_failed=untried
+fi[]dnl
+])dnl _PKG_CONFIG
+
+dnl _PKG_SHORT_ERRORS_SUPPORTED
+dnl ---------------------------
+dnl Internal check to see if pkg-config supports short errors.
+AC_DEFUN([_PKG_SHORT_ERRORS_SUPPORTED],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])
+if $PKG_CONFIG --atleast-pkgconfig-version 0.20; then
+        _pkg_short_errors_supported=yes
+else
+        _pkg_short_errors_supported=no
+fi[]dnl
+])dnl _PKG_SHORT_ERRORS_SUPPORTED
+
+
+dnl PKG_CHECK_MODULES(VARIABLE-PREFIX, MODULES, [ACTION-IF-FOUND],
+dnl   [ACTION-IF-NOT-FOUND])
+dnl --------------------------------------------------------------
+dnl Since: 0.4.0
+dnl
+dnl Note that if there is a possibility the first call to
+dnl PKG_CHECK_MODULES might not happen, you should be sure to include an
+dnl explicit call to PKG_PROG_PKG_CONFIG in your configure.ac
+AC_DEFUN([PKG_CHECK_MODULES],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])dnl
+AC_ARG_VAR([$1][_CFLAGS], [C compiler flags for $1, overriding pkg-config])dnl
+AC_ARG_VAR([$1][_LIBS], [linker flags for $1, overriding pkg-config])dnl
+
+pkg_failed=no
+AC_MSG_CHECKING([for $1])
+
+_PKG_CONFIG([$1][_CFLAGS], [cflags], [$2])
+_PKG_CONFIG([$1][_LIBS], [libs], [$2])
+
+m4_define([_PKG_TEXT], [Alternatively, you may set the environment variables $1[]_CFLAGS
+and $1[]_LIBS to avoid the need to call pkg-config.
+See the pkg-config man page for more details.])
+
+if test $pkg_failed = yes; then
+   	AC_MSG_RESULT([no])
+        _PKG_SHORT_ERRORS_SUPPORTED
+        if test $_pkg_short_errors_supported = yes; then
+	        $1[]_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "$2" 2>&1`
+        else 
+	        $1[]_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "$2" 2>&1`
+        fi
+	# Put the nasty error message in config.log where it belongs
+	echo "$$1[]_PKG_ERRORS" >&AS_MESSAGE_LOG_FD
+
+	m4_default([$4], [AC_MSG_ERROR(
+[Package requirements ($2) were not met:
+
+$$1_PKG_ERRORS
+
+Consider adjusting the PKG_CONFIG_PATH environment variable if you
+installed software in a non-standard prefix.
+
+_PKG_TEXT])[]dnl
+        ])
+elif test $pkg_failed = untried; then
+     	AC_MSG_RESULT([no])
+	m4_default([$4], [AC_MSG_FAILURE(
+[The pkg-config script could not be found or is too old.  Make sure it
+is in your PATH or set the PKG_CONFIG environment variable to the full
+path to pkg-config.
+
+_PKG_TEXT
+
+To get pkg-config, see <http://pkg-config.freedesktop.org/>.])[]dnl
+        ])
+else
+	$1[]_CFLAGS=$pkg_cv_[]$1[]_CFLAGS
+	$1[]_LIBS=$pkg_cv_[]$1[]_LIBS
+        AC_MSG_RESULT([yes])
+	$3
+fi[]dnl
+])dnl PKG_CHECK_MODULES
+
+
+dnl PKG_CHECK_MODULES_STATIC(VARIABLE-PREFIX, MODULES, [ACTION-IF-FOUND],
+dnl   [ACTION-IF-NOT-FOUND])
+dnl ---------------------------------------------------------------------
+dnl Since: 0.29
+dnl
+dnl Checks for existence of MODULES and gathers its build flags with
+dnl static libraries enabled. Sets VARIABLE-PREFIX_CFLAGS from --cflags
+dnl and VARIABLE-PREFIX_LIBS from --libs.
+dnl
+dnl Note that if there is a possibility the first call to
+dnl PKG_CHECK_MODULES_STATIC might not happen, you should be sure to
+dnl include an explicit call to PKG_PROG_PKG_CONFIG in your
+dnl configure.ac.
+AC_DEFUN([PKG_CHECK_MODULES_STATIC],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])dnl
+_save_PKG_CONFIG=$PKG_CONFIG
+PKG_CONFIG="$PKG_CONFIG --static"
+PKG_CHECK_MODULES($@)
+PKG_CONFIG=$_save_PKG_CONFIG[]dnl
+])dnl PKG_CHECK_MODULES_STATIC
+
+
+dnl PKG_INSTALLDIR([DIRECTORY])
+dnl -------------------------
+dnl Since: 0.27
+dnl
+dnl Substitutes the variable pkgconfigdir as the location where a module
+dnl should install pkg-config .pc files. By default the directory is
+dnl $libdir/pkgconfig, but the default can be changed by passing
+dnl DIRECTORY. The user can override through the --with-pkgconfigdir
+dnl parameter.
+AC_DEFUN([PKG_INSTALLDIR],
+[m4_pushdef([pkg_default], [m4_default([$1], ['${libdir}/pkgconfig'])])
+m4_pushdef([pkg_description],
+    [pkg-config installation directory @<:@]pkg_default[@:>@])
+AC_ARG_WITH([pkgconfigdir],
+    [AS_HELP_STRING([--with-pkgconfigdir], pkg_description)],,
+    [with_pkgconfigdir=]pkg_default)
+AC_SUBST([pkgconfigdir], [$with_pkgconfigdir])
+m4_popdef([pkg_default])
+m4_popdef([pkg_description])
+])dnl PKG_INSTALLDIR
+
+
+dnl PKG_NOARCH_INSTALLDIR([DIRECTORY])
+dnl --------------------------------
+dnl Since: 0.27
+dnl
+dnl Substitutes the variable noarch_pkgconfigdir as the location where a
+dnl module should install arch-independent pkg-config .pc files. By
+dnl default the directory is $datadir/pkgconfig, but the default can be
+dnl changed by passing DIRECTORY. The user can override through the
+dnl --with-noarch-pkgconfigdir parameter.
+AC_DEFUN([PKG_NOARCH_INSTALLDIR],
+[m4_pushdef([pkg_default], [m4_default([$1], ['${datadir}/pkgconfig'])])
+m4_pushdef([pkg_description],
+    [pkg-config arch-independent installation directory @<:@]pkg_default[@:>@])
+AC_ARG_WITH([noarch-pkgconfigdir],
+    [AS_HELP_STRING([--with-noarch-pkgconfigdir], pkg_description)],,
+    [with_noarch_pkgconfigdir=]pkg_default)
+AC_SUBST([noarch_pkgconfigdir], [$with_noarch_pkgconfigdir])
+m4_popdef([pkg_default])
+m4_popdef([pkg_description])
+])dnl PKG_NOARCH_INSTALLDIR
+
+
+dnl PKG_CHECK_VAR(VARIABLE, MODULE, CONFIG-VARIABLE,
+dnl [ACTION-IF-FOUND], [ACTION-IF-NOT-FOUND])
+dnl -------------------------------------------
+dnl Since: 0.28
+dnl
+dnl Retrieves the value of the pkg-config variable for the given module.
+AC_DEFUN([PKG_CHECK_VAR],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])dnl
+AC_ARG_VAR([$1], [value of $3 for $2, overriding pkg-config])dnl
+
+_PKG_CONFIG([$1], [variable="][$3]["], [$2])
+AS_VAR_COPY([$1], [pkg_cv_][$1])
+
+AS_VAR_IF([$1], [""], [$5], [$4])dnl
+])dnl PKG_CHECK_VAR
diff --git a/configure b/configure
index 9a83f19821..4f609bbcae 100755
--- a/configure
+++ b/configure
@@ -715,6 +715,12 @@ krb_srvtab
 with_python
 with_perl
 with_tcl
+ICU_LIBS
+ICU_CFLAGS
+PKG_CONFIG_LIBDIR
+PKG_CONFIG_PATH
+PKG_CONFIG
+with_icu
 enable_thread_safety
 INCLUDES
 autodepend
@@ -821,6 +827,7 @@ with_CC
 enable_depend
 enable_cassert
 enable_thread_safety
+with_icu
 with_tcl
 with_tclconfig
 with_perl
@@ -856,6 +863,11 @@ LDFLAGS
 LIBS
 CPPFLAGS
 CPP
+PKG_CONFIG
+PKG_CONFIG_PATH
+PKG_CONFIG_LIBDIR
+ICU_CFLAGS
+ICU_LIBS
 LDFLAGS_EX
 LDFLAGS_SL
 DOCBOOKSTYLE'
@@ -1511,6 +1523,7 @@ Optional Packages:
   --with-wal-segsize=SEGSIZE
                           set WAL segment size in MB [16]
   --with-CC=CMD           set compiler (deprecated)
+  --with-icu              build with ICU support
   --with-tcl              build Tcl modules (PL/Tcl)
   --with-tclconfig=DIR    tclConfig.sh is in DIR
   --with-perl             build Perl modules (PL/Perl)
@@ -1546,6 +1559,13 @@ Some influential environment variables:
   CPPFLAGS    (Objective) C/C++ preprocessor flags, e.g. -I<include dir> if
               you have headers in a nonstandard directory <include dir>
   CPP         C preprocessor
+  PKG_CONFIG  path to pkg-config utility
+  PKG_CONFIG_PATH
+              directories to add to pkg-config's search path
+  PKG_CONFIG_LIBDIR
+              path overriding pkg-config's built-in search path
+  ICU_CFLAGS  C compiler flags for ICU, overriding pkg-config
+  ICU_LIBS    linker flags for ICU, overriding pkg-config
   LDFLAGS_EX  extra linker flags for linking executables only
   LDFLAGS_SL  extra linker flags for linking shared libraries only
   DOCBOOKSTYLE
@@ -5368,6 +5388,255 @@ $as_echo "$enable_thread_safety" >&6; }
 
 
 #
+# ICU
+#
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with ICU support" >&5
+$as_echo_n "checking whether to build with ICU support... " >&6; }
+
+
+
+# Check whether --with-icu was given.
+if test "${with_icu+set}" = set; then :
+  withval=$with_icu;
+  case $withval in
+    yes)
+
+$as_echo "#define USE_ICU 1" >>confdefs.h
+
+      ;;
+    no)
+      :
+      ;;
+    *)
+      as_fn_error $? "no argument expected for --with-icu option" "$LINENO" 5
+      ;;
+  esac
+
+else
+  with_icu=no
+
+fi
+
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $with_icu" >&5
+$as_echo "$with_icu" >&6; }
+
+
+if test "$with_icu" = yes; then
+
+
+
+
+
+
+
+if test "x$ac_cv_env_PKG_CONFIG_set" != "xset"; then
+	if test -n "$ac_tool_prefix"; then
+  # Extract the first word of "${ac_tool_prefix}pkg-config", so it can be a program name with args.
+set dummy ${ac_tool_prefix}pkg-config; ac_word=$2
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
+$as_echo_n "checking for $ac_word... " >&6; }
+if ${ac_cv_path_PKG_CONFIG+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  case $PKG_CONFIG in
+  [\\/]* | ?:[\\/]*)
+  ac_cv_path_PKG_CONFIG="$PKG_CONFIG" # Let the user override the test with a path.
+  ;;
+  *)
+  as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
+for as_dir in $PATH
+do
+  IFS=$as_save_IFS
+  test -z "$as_dir" && as_dir=.
+    for ac_exec_ext in '' $ac_executable_extensions; do
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
+    ac_cv_path_PKG_CONFIG="$as_dir/$ac_word$ac_exec_ext"
+    $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
+    break 2
+  fi
+done
+  done
+IFS=$as_save_IFS
+
+  ;;
+esac
+fi
+PKG_CONFIG=$ac_cv_path_PKG_CONFIG
+if test -n "$PKG_CONFIG"; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $PKG_CONFIG" >&5
+$as_echo "$PKG_CONFIG" >&6; }
+else
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+fi
+
+
+fi
+if test -z "$ac_cv_path_PKG_CONFIG"; then
+  ac_pt_PKG_CONFIG=$PKG_CONFIG
+  # Extract the first word of "pkg-config", so it can be a program name with args.
+set dummy pkg-config; ac_word=$2
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
+$as_echo_n "checking for $ac_word... " >&6; }
+if ${ac_cv_path_ac_pt_PKG_CONFIG+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  case $ac_pt_PKG_CONFIG in
+  [\\/]* | ?:[\\/]*)
+  ac_cv_path_ac_pt_PKG_CONFIG="$ac_pt_PKG_CONFIG" # Let the user override the test with a path.
+  ;;
+  *)
+  as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
+for as_dir in $PATH
+do
+  IFS=$as_save_IFS
+  test -z "$as_dir" && as_dir=.
+    for ac_exec_ext in '' $ac_executable_extensions; do
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
+    ac_cv_path_ac_pt_PKG_CONFIG="$as_dir/$ac_word$ac_exec_ext"
+    $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
+    break 2
+  fi
+done
+  done
+IFS=$as_save_IFS
+
+  ;;
+esac
+fi
+ac_pt_PKG_CONFIG=$ac_cv_path_ac_pt_PKG_CONFIG
+if test -n "$ac_pt_PKG_CONFIG"; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_pt_PKG_CONFIG" >&5
+$as_echo "$ac_pt_PKG_CONFIG" >&6; }
+else
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+fi
+
+  if test "x$ac_pt_PKG_CONFIG" = x; then
+    PKG_CONFIG=""
+  else
+    case $cross_compiling:$ac_tool_warned in
+yes:)
+{ $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: using cross tools not prefixed with host triplet" >&5
+$as_echo "$as_me: WARNING: using cross tools not prefixed with host triplet" >&2;}
+ac_tool_warned=yes ;;
+esac
+    PKG_CONFIG=$ac_pt_PKG_CONFIG
+  fi
+else
+  PKG_CONFIG="$ac_cv_path_PKG_CONFIG"
+fi
+
+fi
+if test -n "$PKG_CONFIG"; then
+	_pkg_min_version=0.9.0
+	{ $as_echo "$as_me:${as_lineno-$LINENO}: checking pkg-config is at least version $_pkg_min_version" >&5
+$as_echo_n "checking pkg-config is at least version $_pkg_min_version... " >&6; }
+	if $PKG_CONFIG --atleast-pkgconfig-version $_pkg_min_version; then
+		{ $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5
+$as_echo "yes" >&6; }
+	else
+		{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+		PKG_CONFIG=""
+	fi
+fi
+
+pkg_failed=no
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for ICU" >&5
+$as_echo_n "checking for ICU... " >&6; }
+
+if test -n "$ICU_CFLAGS"; then
+    pkg_cv_ICU_CFLAGS="$ICU_CFLAGS"
+ elif test -n "$PKG_CONFIG"; then
+    if test -n "$PKG_CONFIG" && \
+    { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"icu-uc icu-i18n\""; } >&5
+  ($PKG_CONFIG --exists --print-errors "icu-uc icu-i18n") 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; then
+  pkg_cv_ICU_CFLAGS=`$PKG_CONFIG --cflags "icu-uc icu-i18n" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes
+else
+  pkg_failed=yes
+fi
+ else
+    pkg_failed=untried
+fi
+if test -n "$ICU_LIBS"; then
+    pkg_cv_ICU_LIBS="$ICU_LIBS"
+ elif test -n "$PKG_CONFIG"; then
+    if test -n "$PKG_CONFIG" && \
+    { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"icu-uc icu-i18n\""; } >&5
+  ($PKG_CONFIG --exists --print-errors "icu-uc icu-i18n") 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; then
+  pkg_cv_ICU_LIBS=`$PKG_CONFIG --libs "icu-uc icu-i18n" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes
+else
+  pkg_failed=yes
+fi
+ else
+    pkg_failed=untried
+fi
+
+
+
+if test $pkg_failed = yes; then
+   	{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+
+if $PKG_CONFIG --atleast-pkgconfig-version 0.20; then
+        _pkg_short_errors_supported=yes
+else
+        _pkg_short_errors_supported=no
+fi
+        if test $_pkg_short_errors_supported = yes; then
+	        ICU_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "icu-uc icu-i18n" 2>&1`
+        else
+	        ICU_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "icu-uc icu-i18n" 2>&1`
+        fi
+	# Put the nasty error message in config.log where it belongs
+	echo "$ICU_PKG_ERRORS" >&5
+
+	as_fn_error $? "Package requirements (icu-uc icu-i18n) were not met:
+
+$ICU_PKG_ERRORS
+
+Consider adjusting the PKG_CONFIG_PATH environment variable if you
+installed software in a non-standard prefix.
+
+Alternatively, you may set the environment variables ICU_CFLAGS
+and ICU_LIBS to avoid the need to call pkg-config.
+See the pkg-config man page for more details." "$LINENO" 5
+elif test $pkg_failed = untried; then
+     	{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+	{ { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5
+$as_echo "$as_me: error: in \`$ac_pwd':" >&2;}
+as_fn_error $? "The pkg-config script could not be found or is too old.  Make sure it
+is in your PATH or set the PKG_CONFIG environment variable to the full
+path to pkg-config.
+
+Alternatively, you may set the environment variables ICU_CFLAGS
+and ICU_LIBS to avoid the need to call pkg-config.
+See the pkg-config man page for more details.
+
+To get pkg-config, see <http://pkg-config.freedesktop.org/>.
+See \`config.log' for more details" "$LINENO" 5; }
+else
+	ICU_CFLAGS=$pkg_cv_ICU_CFLAGS
+	ICU_LIBS=$pkg_cv_ICU_LIBS
+        { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5
+$as_echo "yes" >&6; }
+
+fi
+fi
+
+#
 # Optionally build Tcl modules (PL/Tcl)
 #
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with Tcl" >&5
@@ -13461,6 +13730,50 @@ fi
 done
 
 
+if test "$with_icu" = yes; then
+  # ICU functions are macros, so we need to do this the long way.
+
+  # ucol_strcollUTF8() appeared in ICU 50.
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for ucol_strcollUTF8" >&5
+$as_echo_n "checking for ucol_strcollUTF8... " >&6; }
+if ${pgac_cv_func_ucol_strcollUTF8+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  ac_save_CPPFLAGS=$CPPFLAGS
+CPPFLAGS="$ICU_CFLAGS $CPPFLAGS"
+ac_save_LIBS=$LIBS
+LIBS="$ICU_LIBS $LIBS"
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+#include <unicode/ucol.h>
+
+int
+main ()
+{
+ucol_strcollUTF8(NULL, NULL, 0, NULL, 0, NULL);
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+  pgac_cv_func_ucol_strcollUTF8=yes
+else
+  pgac_cv_func_ucol_strcollUTF8=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+    conftest$ac_exeext conftest.$ac_ext
+CPPFLAGS=$ac_save_CPPFLAGS
+LIBS=$ac_save_LIBS
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_func_ucol_strcollUTF8" >&5
+$as_echo "$pgac_cv_func_ucol_strcollUTF8" >&6; }
+  if test "$pgac_cv_func_ucol_strcollUTF8" = yes ; then
+
+$as_echo "#define HAVE_UCOL_STRCOLLUTF8 1" >>confdefs.h
+
+  fi
+fi
+
 # Lastly, restore full LIBS list and check for readline/libedit symbols
 LIBS="$LIBS_including_readline"
 
diff --git a/configure.in b/configure.in
index 52e4e78471..ba42f2893c 100644
--- a/configure.in
+++ b/configure.in
@@ -614,6 +614,19 @@ AC_MSG_RESULT([$enable_thread_safety])
 AC_SUBST(enable_thread_safety)
 
 #
+# ICU
+#
+AC_MSG_CHECKING([whether to build with ICU support])
+PGAC_ARG_BOOL(with, icu, no, [build with ICU support],
+              [AC_DEFINE([USE_ICU], 1, [Define to build with ICU support. (--with-icu)])])
+AC_MSG_RESULT([$with_icu])
+AC_SUBST(with_icu)
+
+if test "$with_icu" = yes; then
+  PKG_CHECK_MODULES(ICU, icu-uc icu-i18n)
+fi
+
+#
 # Optionally build Tcl modules (PL/Tcl)
 #
 AC_MSG_CHECKING([whether to build with Tcl])
@@ -1640,6 +1653,28 @@ fi
 AC_CHECK_FUNCS([strtoll strtoq], [break])
 AC_CHECK_FUNCS([strtoull strtouq], [break])
 
+if test "$with_icu" = yes; then
+  # ICU functions are macros, so we need to do this the long way.
+
+  # ucol_strcollUTF8() appeared in ICU 50.
+  AC_CACHE_CHECK([for ucol_strcollUTF8], [pgac_cv_func_ucol_strcollUTF8],
+[ac_save_CPPFLAGS=$CPPFLAGS
+CPPFLAGS="$ICU_CFLAGS $CPPFLAGS"
+ac_save_LIBS=$LIBS
+LIBS="$ICU_LIBS $LIBS"
+AC_LINK_IFELSE([AC_LANG_PROGRAM(
+[#include <unicode/ucol.h>
+],
+[ucol_strcollUTF8(NULL, NULL, 0, NULL, 0, NULL);])],
+[pgac_cv_func_ucol_strcollUTF8=yes],
+[pgac_cv_func_ucol_strcollUTF8=no])
+CPPFLAGS=$ac_save_CPPFLAGS
+LIBS=$ac_save_LIBS])
+  if test "$pgac_cv_func_ucol_strcollUTF8" = yes ; then
+    AC_DEFINE([HAVE_UCOL_STRCOLLUTF8], 1, [Define to 1 if you have the `ucol_strcollUTF8' function.])
+  fi
+fi
+
 # Lastly, restore full LIBS list and check for readline/libedit symbols
 LIBS="$LIBS_including_readline"
 
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 524180e011..627ff03297 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -2004,6 +2004,14 @@ <title><structname>pg_collation</> Columns</title>
      </row>
 
      <row>
+      <entry><structfield>collprovider</structfield></entry>
+      <entry><type>char</type></entry>
+      <entry></entry>
+      <entry>Provider of the collation: <literal>d</literal> = database
+       default, <literal>c</literal> = libc, <literal>i</literal> = icu</entry>
+     </row>
+
+     <row>
       <entry><structfield>collencoding</structfield></entry>
       <entry><type>int4</type></entry>
       <entry></entry>
@@ -2024,6 +2032,17 @@ <title><structname>pg_collation</> Columns</title>
       <entry></entry>
       <entry><symbol>LC_CTYPE</> for this collation object</entry>
      </row>
+
+     <row>
+      <entry><structfield>collversion</structfield></entry>
+      <entry><type>int4</type></entry>
+      <entry></entry>
+      <entry>
+       Provider-specific version of the collation.  This is recorded when the
+       collation is created and then checked when it is used, to detect
+       changes in the collation definition that could lead to data corruption.
+      </entry>
+     </row>
     </tbody>
    </tgroup>
   </table>
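
To illustrate the two new columns, on a build configured with ICU a query like the
following shows the provider letter ('i' as described above) and the recorded collator
version; the exact rows depend on the ICU library and locales installed, so this is a
sketch rather than expected output:

    SELECT collname, collprovider, to_hex(collversion) AS collversion
    FROM pg_collation
    WHERE collprovider = 'i'
    LIMIT 3;
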
diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml
index 2aba0fc528..8116b8cdbb 100644
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@@ -500,16 +500,28 @@ <title>Concepts</title>
    <title>Managing Collations</title>
 
    <para>
-    A collation is an SQL schema object that maps an SQL name to
-    operating system locales.  In particular, it maps to a combination
-    of <symbol>LC_COLLATE</symbol> and <symbol>LC_CTYPE</symbol>.  (As
+    A collation is an SQL schema object that maps an SQL name to locales
+    provided by libraries installed in the operating system.  A collation
+    definition has a <firstterm>provider</firstterm> that specifies which
+    library supplies the locale data.  One standard provider name
+    is <literal>libc</literal>, which uses the locales provided by the
+    operating system C library.  These are the locales that most tools
+    provided by the operating system use.  Another provider
+    is <literal>icu</literal>, which uses the external
+    ICU<indexterm><primary>ICU</></> library.  Support for ICU has to be
+    configured when PostgreSQL is built.
+   </para>
+
+   <para>
+    A collation object provided by <literal>libc</literal> maps to a combination
+    of <symbol>LC_COLLATE</symbol> and <symbol>LC_CTYPE</symbol> settings.  (As
     the name would suggest, the main purpose of a collation is to set
     <symbol>LC_COLLATE</symbol>, which controls the sort order.  But
     it is rarely necessary in practice to have an
     <symbol>LC_CTYPE</symbol> setting that is different from
     <symbol>LC_COLLATE</symbol>, so it is more convenient to collect
     these under one concept than to create another infrastructure for
-    setting <symbol>LC_CTYPE</symbol> per expression.)  Also, a collation
+    setting <symbol>LC_CTYPE</symbol> per expression.)  Also, a <literal>libc</literal> collation
     is tied to a character set encoding (see <xref linkend="multibyte">).
     The same collation name may exist for different encodings.
    </para>
@@ -528,8 +540,18 @@ <title>Managing Collations</title>
    </para>
 
    <para>
+    A collation provided by <literal>icu</literal> maps to a named collator
+    provided by the ICU library.  ICU does not support
+    separate <quote>collate</quote> and <quote>ctype</quote> settings, so they
+    are always the same.  Also, ICU collations are independent of the
+    encoding, so there is always only one ICU collation for a given name in a
+    database.
+   </para>
+
+   <para>
     If the operating system provides support for using multiple locales
     within a single program (<function>newlocale</> and related functions),
+    or support for ICU is configured,
     then when a database cluster is initialized, <command>initdb</command>
     populates the system catalog <literal>pg_collation</literal> with
     collations based on all the locales it finds on the operating
@@ -544,17 +566,20 @@ <title>Managing Collations</title>
     under the name <literal>de_DE</literal>, which is less cumbersome
     to write and makes the name less encoding-dependent.  Note that,
     nevertheless, the initial set of collation names is
-    platform-dependent.
+    platform-dependent.  Collations provided by ICU are created
+    with <literal>%icu</literal> appended, for
+    example <literal>de_DE%icu</literal>.
    </para>
 
    <para>
-    In case a collation is needed that has different values for
+    In case a <literal>libc</literal> collation is needed that has different values for
     <symbol>LC_COLLATE</symbol> and <symbol>LC_CTYPE</symbol>, a new
     collation may be created using
     the <xref linkend="sql-createcollation"> command.  That command
     can also be used to create a new collation from an existing
     collation, which can be useful to be able to use
-    operating-system-independent collation names in applications.
+    operating-system-independent collation names in applications, or to use an
+    ICU-provided collation under a suffix-less name.
    </para>
 
    <para>
@@ -566,8 +591,8 @@ <title>Managing Collations</title>
     Use of the stripped collation names is recommended, since it will
     make one less thing you need to change if you decide to change to
     another database encoding.  Note however that the <literal>default</>,
-    <literal>C</>, and <literal>POSIX</> collations can be used
-    regardless of the database encoding.
+    <literal>C</>, and <literal>POSIX</> collations, as well as all collations
+    provided by ICU, can be used regardless of the database encoding.
    </para>
 
    <para>
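
To illustrate, an ICU-provided collation can be attached to an expression just like a
libc one; here "fr_FR%icu" assumes the fr_FR locale is available from ICU, and the table
and column names are made up:

    SELECT name FROM customers ORDER BY name COLLATE "fr_FR%icu";
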
diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
index 4431ed75a9..5c96cba51a 100644
--- a/doc/src/sgml/installation.sgml
+++ b/doc/src/sgml/installation.sgml
@@ -771,6 +771,20 @@ <title>Configuration</title>
       </varlistentry>
 
       <varlistentry>
+       <term><option>--with-icu</option></term>
+       <listitem>
+        <para>
+         Build with support for
+         the <productname>ICU</productname><indexterm><primary>ICU</></>
+         library.  This requires the <productname>ICU4C</productname> package
+         as well
+         as <productname>pkg-config</productname><indexterm><primary>pkg-config</></>
+         to be installed.
+        </para>
+       </listitem>
+      </varlistentry>
+
+      <varlistentry>
        <term><option>--with-openssl</option>
        <indexterm>
         <primary>OpenSSL</primary>
diff --git a/doc/src/sgml/ref/alter_collation.sgml b/doc/src/sgml/ref/alter_collation.sgml
index 6708c7e10e..c218539c1b 100644
--- a/doc/src/sgml/ref/alter_collation.sgml
+++ b/doc/src/sgml/ref/alter_collation.sgml
@@ -21,6 +21,8 @@
 
  <refsynopsisdiv>
 <synopsis>
+ALTER COLLATION <replaceable>name</replaceable> REFRESH VERSION
+
 ALTER COLLATION <replaceable>name</replaceable> RENAME TO <replaceable>new_name</replaceable>
 ALTER COLLATION <replaceable>name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_USER | SESSION_USER }
 ALTER COLLATION <replaceable>name</replaceable> SET SCHEMA <replaceable>new_schema</replaceable>
@@ -85,9 +87,49 @@ <title>Parameters</title>
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term><literal>REFRESH VERSION</literal></term>
+    <listitem>
+     <para>
+      Update the collation version.
+      See <xref linkend="sql-altercollation-notes"> below.
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
  </refsect1>
 
+ <refsect1 id="sql-altercollation-notes">
+  <title>Notes</title>
+
+  <para>
+   When using collations provided by the ICU library, the ICU-specific version
+   of the collator is recorded in the system catalog when the collation object
+   is created.  When the collation is then used, the current version is
+   checked against the recorded version, and a warning is issued when there is
+   a mismatch, for example:
+<screen>
+WARNING:  ICU collator version mismatch
+DETAIL:  The database was created using version 0x00000005, the library provides version 0x00000006.
+HINT:  Rebuild all objects affected by this collation and run ALTER COLLATION pg_catalog."xx%icu" REFRESH VERSION, or build PostgreSQL with the right version of ICU.
+</screen>
+   A change in collation definitions can lead to corrupt indexes and other
+   problems where the database system relies on stored objects having a
+   certain sort order.  Generally, this should be avoided, but it can happen
+   in legitimate circumstances, such as when
+   using <command>pg_upgrade</command> to upgrade to server binaries linked
+   with a newer version of ICU.  When this happens, all objects depending on
+   the collation should be rebuilt, for example,
+   using <command>REINDEX</command>.  When that is done, the collation version
+   can be refreshed using the command <literal>ALTER COLLATION ... REFRESH
+   VERSION</literal>.  This will update the system catalog to record the
+   current collator version and will make the warning go away.  Note that this
+   does not actually check whether all affected objects have been rebuilt
+   correctly.
+  </para>
+ </refsect1>
+
  <refsect1>
   <title>Examples</title>
 
diff --git a/doc/src/sgml/ref/create_collation.sgml b/doc/src/sgml/ref/create_collation.sgml
index d757cdfb43..b958f26fab 100644
--- a/doc/src/sgml/ref/create_collation.sgml
+++ b/doc/src/sgml/ref/create_collation.sgml
@@ -21,7 +21,8 @@
 CREATE COLLATION <replaceable>name</replaceable> (
     [ LOCALE = <replaceable>locale</replaceable>, ]
     [ LC_COLLATE = <replaceable>lc_collate</replaceable>, ]
-    [ LC_CTYPE = <replaceable>lc_ctype</replaceable> ]
+    [ LC_CTYPE = <replaceable>lc_ctype</replaceable>, ]
+    [ PROVIDER = <replaceable>provider</replaceable> ]
 )
 CREATE COLLATION <replaceable>name</replaceable> FROM <replaceable>existing_collation</replaceable>
 </synopsis>
@@ -103,6 +104,19 @@ <title>Parameters</title>
     </varlistentry>
 
     <varlistentry>
+     <term><replaceable>provider</replaceable></term>
+
+     <listitem>
+      <para>
+       Specifies the provider to use for locale services associated with this
+       collation.  Possible values
+       are: <literal>icu</literal>,<indexterm><primary>ICU</></> <literal>libc</literal>.
+       The available choices depend on the operating system and build options.
+      </para>
+     </listitem>
+    </varlistentry>
+
+    <varlistentry>
      <term><replaceable>existing_collation</replaceable></term>
 
      <listitem>
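
To illustrate the new option, a collation using the ICU provider could be created under
a suffix-less name of one's choosing, assuming the de_DE locale is known to ICU:

    CREATE COLLATION german (provider = icu, locale = 'de_DE');

The name "german" is arbitrary; the locale value is passed through to the ICU collator.
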
diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index 59bd7996d1..218d06c2b3 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -179,6 +179,7 @@ pgxsdir = $(pkglibdir)/pgxs
 #
 # Records the choice of the various --enable-xxx and --with-xxx options.
 
+with_icu	= @with_icu@
 with_perl	= @with_perl@
 with_python	= @with_python@
 with_tcl	= @with_tcl@
@@ -208,6 +209,9 @@ python_version		= @python_version@
 
 krb_srvtab = @krb_srvtab@
 
+ICU_CFLAGS		= @ICU_CFLAGS@
+ICU_LIBS		= @ICU_LIBS@
+
 TCLSH			= @TCLSH@
 TCL_LIBS		= @TCL_LIBS@
 TCL_LIB_SPEC		= @TCL_LIB_SPEC@
diff --git a/src/backend/Makefile b/src/backend/Makefile
index 7a0bbb2942..fffb0d95ba 100644
--- a/src/backend/Makefile
+++ b/src/backend/Makefile
@@ -58,7 +58,7 @@ ifneq ($(PORTNAME), win32)
 ifneq ($(PORTNAME), aix)
 
 postgres: $(OBJS)
-	$(CC) $(CFLAGS) $(LDFLAGS) $(LDFLAGS_EX) $(export_dynamic) $(call expand_subsys,$^) $(LIBS) -o $@
+	$(CC) $(CFLAGS) $(LDFLAGS) $(LDFLAGS_EX) $(export_dynamic) $(call expand_subsys,$^) $(LIBS) $(ICU_LIBS) -o $@
 
 endif
 endif
diff --git a/src/backend/catalog/pg_collation.c b/src/backend/catalog/pg_collation.c
index 694c0f67f5..47c9a44a28 100644
--- a/src/backend/catalog/pg_collation.c
+++ b/src/backend/catalog/pg_collation.c
@@ -40,8 +40,10 @@
 Oid
 CollationCreate(const char *collname, Oid collnamespace,
 				Oid collowner,
+				char collprovider,
 				int32 collencoding,
 				const char *collcollate, const char *collctype,
+				int32 collversion,
 				bool if_not_exists)
 {
 	Relation	rel;
@@ -78,21 +80,27 @@ CollationCreate(const char *collname, Oid collnamespace,
 		{
 			ereport(NOTICE,
 				(errcode(ERRCODE_DUPLICATE_OBJECT),
-				 errmsg("collation \"%s\" for encoding \"%s\" already exists, skipping",
-						collname, pg_encoding_to_char(collencoding))));
+				 collencoding == -1
+				 ? errmsg("collation \"%s\" already exists, skipping",
+						  collname)
+				 : errmsg("collation \"%s\" for encoding \"%s\" already exists, skipping",
+						  collname, pg_encoding_to_char(collencoding))));
 			return InvalidOid;
 		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_DUPLICATE_OBJECT),
-					 errmsg("collation \"%s\" for encoding \"%s\" already exists",
-							collname, pg_encoding_to_char(collencoding))));
+					 collencoding == -1
+					 ? errmsg("collation \"%s\" already exists",
+							  collname)
+					 : errmsg("collation \"%s\" for encoding \"%s\" already exists",
+							  collname, pg_encoding_to_char(collencoding))));
 	}
 
 	/*
 	 * Also forbid matching an any-encoding entry.  This test of course is not
 	 * backed up by the unique index, but it's not a problem since we don't
-	 * support adding any-encoding entries after initdb.
+	 * support adding any-encoding entries after initdb. FIXME
 	 */
 	if (SearchSysCacheExists3(COLLNAMEENCNSP,
 							  PointerGetDatum(collname),
@@ -125,11 +133,13 @@ CollationCreate(const char *collname, Oid collnamespace,
 	values[Anum_pg_collation_collname - 1] = NameGetDatum(&name_name);
 	values[Anum_pg_collation_collnamespace - 1] = ObjectIdGetDatum(collnamespace);
 	values[Anum_pg_collation_collowner - 1] = ObjectIdGetDatum(collowner);
+	values[Anum_pg_collation_collprovider - 1] = CharGetDatum(collprovider);
 	values[Anum_pg_collation_collencoding - 1] = Int32GetDatum(collencoding);
 	namestrcpy(&name_collate, collcollate);
 	values[Anum_pg_collation_collcollate - 1] = NameGetDatum(&name_collate);
 	namestrcpy(&name_ctype, collctype);
 	values[Anum_pg_collation_collctype - 1] = NameGetDatum(&name_ctype);
+	values[Anum_pg_collation_collversion - 1] = Int32GetDatum(collversion);
 
 	tup = heap_form_tuple(tupDesc, values, nulls);
 
diff --git a/src/backend/commands/collationcmds.c b/src/backend/commands/collationcmds.c
index 8d4d5b7b63..e789908e3a 100644
--- a/src/backend/commands/collationcmds.c
+++ b/src/backend/commands/collationcmds.c
@@ -14,11 +14,13 @@
  */
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/htup_details.h"
 #include "access/xact.h"
 #include "catalog/dependency.h"
 #include "catalog/indexing.h"
 #include "catalog/namespace.h"
+#include "catalog/objectaccess.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_collation_fn.h"
 #include "commands/alter.h"
@@ -33,6 +35,9 @@
 #include "utils/rel.h"
 #include "utils/syscache.h"
 
+static int32 get_collation_version(char collprovider, const char *collcollate);
+
+
 /*
  * CREATE COLLATION
  */
@@ -47,8 +52,13 @@ DefineCollation(ParseState *pstate, List *names, List *parameters)
 	DefElem    *localeEl = NULL;
 	DefElem    *lccollateEl = NULL;
 	DefElem    *lcctypeEl = NULL;
+	DefElem    *providerEl = NULL;
 	char	   *collcollate = NULL;
 	char	   *collctype = NULL;
+	char	   *collproviderstr = NULL;
+	int			collencoding;
+	char		collprovider;
+	int32		collversion;
 	Oid			newoid;
 	ObjectAddress address;
 
@@ -72,6 +82,8 @@ DefineCollation(ParseState *pstate, List *names, List *parameters)
 			defelp = &lccollateEl;
 		else if (pg_strcasecmp(defel->defname, "lc_ctype") == 0)
 			defelp = &lcctypeEl;
+		else if (pg_strcasecmp(defel->defname, "provider") == 0)
+			defelp = &providerEl;
 		else
 		{
 			ereport(ERROR,
@@ -103,6 +115,7 @@ DefineCollation(ParseState *pstate, List *names, List *parameters)
 
 		collcollate = pstrdup(NameStr(((Form_pg_collation) GETSTRUCT(tp))->collcollate));
 		collctype = pstrdup(NameStr(((Form_pg_collation) GETSTRUCT(tp))->collctype));
+		collprovider = ((Form_pg_collation) GETSTRUCT(tp))->collprovider;
 
 		ReleaseSysCache(tp);
 	}
@@ -119,6 +132,24 @@ DefineCollation(ParseState *pstate, List *names, List *parameters)
 	if (lcctypeEl)
 		collctype = defGetString(lcctypeEl);
 
+	if (providerEl)
+		collproviderstr = defGetString(providerEl);
+
+	if (collproviderstr)
+	{
+		if (pg_strcasecmp(collproviderstr, "icu") == 0)
+			collprovider = COLLPROVIDER_ICU;
+		else if (pg_strcasecmp(collproviderstr, "libc") == 0)
+			collprovider = COLLPROVIDER_LIBC;
+		else
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+					 errmsg("unrecognized collation provider: %s",
+							collproviderstr)));
+	}
+	else if (!fromEl)
+		collprovider = COLLPROVIDER_LIBC;
+
 	if (!collcollate)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
@@ -129,14 +160,24 @@ DefineCollation(ParseState *pstate, List *names, List *parameters)
 				(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
 				 errmsg("parameter \"lc_ctype\" must be specified")));
 
-	check_encoding_locale_matches(GetDatabaseEncoding(), collcollate, collctype);
+	if (collprovider == COLLPROVIDER_ICU)
+		collencoding = -1;
+	else
+	{
+		collencoding = GetDatabaseEncoding();
+		check_encoding_locale_matches(collencoding, collcollate, collctype);
+	}
+
+	collversion = get_collation_version(collprovider, collcollate);
 
 	newoid = CollationCreate(collName,
 							 collNamespace,
 							 GetUserId(),
-							 GetDatabaseEncoding(),
+							 collprovider,
+							 collencoding,
 							 collcollate,
 							 collctype,
+							 collversion,
 							 false);
 
 	if (!OidIsValid(newoid))
@@ -151,6 +192,37 @@ DefineCollation(ParseState *pstate, List *names, List *parameters)
 	return address;
 }
 
+static int32
+get_collation_version(char collprovider, const char *collcollate)
+{
+	int32	collversion;
+
+#ifdef USE_ICU
+	if (collprovider == COLLPROVIDER_ICU)
+	{
+		UCollator  *collator;
+		UErrorCode	status;
+		UVersionInfo versioninfo;
+
+		status = U_ZERO_ERROR;
+		collator = ucol_open(collcollate, &status);
+		if (U_FAILURE(status))
+			ereport(ERROR,
+					(errmsg("could not open collator for locale \"%s\": %s",
+							collcollate, u_errorName(status))));
+
+		ucol_getVersion(collator, versioninfo);
+		ucol_close(collator);
+
+		collversion = ntohl(*((uint32 *) versioninfo));
+	}
+	else
+#endif
+		collversion = 0;
+
+	return collversion;
+}
+
 /*
  * Subroutine for ALTER COLLATION SET SCHEMA and RENAME
  *
@@ -182,6 +254,58 @@ IsThereCollationInNamespace(const char *collname, Oid nspOid)
 						collname, get_namespace_name(nspOid))));
 }
 
+/*
+ * ALTER COLLATION
+ */
+ObjectAddress
+AlterCollation(AlterCollationStmt *stmt)
+{
+	Relation	rel;
+	Oid			collOid;
+	HeapTuple	tup;
+	Form_pg_collation collForm;
+	int32		newversion;
+	ObjectAddress address;
+
+	rel = heap_open(CollationRelationId, RowExclusiveLock);
+	collOid = get_collation_oid(stmt->collname, false);
+
+	if (!pg_collation_ownercheck(collOid, GetUserId()))
+		aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_COLLATION,
+					   NameListToString(stmt->collname));
+
+	tup = SearchSysCacheCopy1(COLLOID, ObjectIdGetDatum(collOid));
+	if (!HeapTupleIsValid(tup))
+		elog(ERROR, "cache lookup failed for collation %u", collOid);
+
+	collForm = (Form_pg_collation) GETSTRUCT(tup);
+
+	newversion = get_collation_version(collForm->collprovider, NameStr(collForm->collcollate));
+	if (newversion != collForm->collversion)
+	{
+		/* This version formatting is specific to ICU. */
+		ereport(NOTICE,
+				(errmsg("changing version from 0x%08X to 0x%08X",
+						(uint32) collForm->collversion, (uint32) newversion)));
+		collForm->collversion = newversion;
+	}
+	else
+		ereport(NOTICE,
+				(errmsg("version has not changed")));
+
+	simple_heap_update(rel, &tup->t_self, tup);
+	CatalogUpdateIndexes(rel, tup);
+
+	InvokeObjectPostAlterHook(CollationRelationId, collOid, 0);
+
+	ObjectAddressSet(address, CollationRelationId, collOid);
+
+	heap_freetuple(tup);
+	heap_close(rel, NoLock);
+
+	return address;
+}
+
 
 /*
  * "Normalize" a locale name, stripping off encoding tags such as
@@ -302,8 +426,8 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
 
 		count++;
 
-		CollationCreate(localebuf, nspid, GetUserId(), enc,
-						localebuf, localebuf, if_not_exists);
+		CollationCreate(localebuf, nspid, GetUserId(), COLLPROVIDER_LIBC, enc,
+						localebuf, localebuf, 0, if_not_exists);
 
 		CommandCounterIncrement();
 
@@ -333,8 +457,8 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
 		char	   *locale = (char *) lfirst(lcl);
 		int			enc = lfirst_int(lce);
 
-		CollationCreate(alias, nspid, GetUserId(), enc,
-						locale, locale, true);
+		CollationCreate(alias, nspid, GetUserId(), COLLPROVIDER_LIBC, enc,
+						locale, locale, 0, true);
 		CommandCounterIncrement();
 	}
 
@@ -343,5 +467,35 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
 				(errmsg("no usable system locales were found")));
 #endif   /* not HAVE_LOCALE_T && not WIN32 */
 
+#ifdef USE_ICU
+	{
+		int i;
+
+		for (i = 0; i < uloc_countAvailable(); i++)
+		{
+			const char *name = uloc_getAvailable(i);
+			UCollator  *collator;
+			UErrorCode	status;
+			UVersionInfo versioninfo;
+			int32		collversion;
+
+			status = U_ZERO_ERROR;
+			collator = ucol_open(name, &status);
+			if (U_FAILURE(status))
+				ereport(ERROR,
+						(errmsg("could not open collator for locale \"%s\": %s",
+								name, u_errorName(status))));
+			ucol_getVersion(collator, versioninfo);
+			ucol_close(collator);
+
+			collversion = ntohl(*((uint32 *) versioninfo));
+
+			CollationCreate(psprintf("%s%%icu", name),
+							nspid, GetUserId(), COLLPROVIDER_ICU, -1,
+							name, name, collversion, if_not_exists);
+		}
+	}
+#endif
+
 	PG_RETURN_VOID();
 }
diff --git a/src/backend/common.mk b/src/backend/common.mk
index 5d599dbd0c..0b57543bc4 100644
--- a/src/backend/common.mk
+++ b/src/backend/common.mk
@@ -8,6 +8,8 @@
 # this directory and SUBDIRS to subdirectories containing more things
 # to build.
 
+override CPPFLAGS := $(CPPFLAGS) $(ICU_CFLAGS)
+
 ifdef PARTIAL_LINKING
 # old style: linking using SUBSYS.o
 subsysfilename = SUBSYS.o
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 30d733e57a..26ff0f3e50 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -2921,6 +2921,16 @@ _copyAlterTableCmd(const AlterTableCmd *from)
 	return newnode;
 }
 
+static AlterCollationStmt *
+_copyAlterCollationStmt(const AlterCollationStmt *from)
+{
+	AlterCollationStmt *newnode = makeNode(AlterCollationStmt);
+
+	COPY_NODE_FIELD(collname);
+
+	return newnode;
+}
+
 static AlterDomainStmt *
 _copyAlterDomainStmt(const AlterDomainStmt *from)
 {
@@ -4852,6 +4862,9 @@ copyObject(const void *from)
 		case T_AlterTableCmd:
 			retval = _copyAlterTableCmd(from);
 			break;
+		case T_AlterCollationStmt:
+			retval = _copyAlterCollationStmt(from);
+			break;
 		case T_AlterDomainStmt:
 			retval = _copyAlterDomainStmt(from);
 			break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 55c73b7292..8bb94bf5d6 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1066,6 +1066,14 @@ _equalAlterTableCmd(const AlterTableCmd *a, const AlterTableCmd *b)
 }
 
 static bool
+_equalAlterCollationStmt(const AlterCollationStmt *a, const AlterCollationStmt *b)
+{
+	COMPARE_NODE_FIELD(collname);
+
+	return true;
+}
+
+static bool
 _equalAlterDomainStmt(const AlterDomainStmt *a, const AlterDomainStmt *b)
 {
 	COMPARE_SCALAR_FIELD(subtype);
@@ -3110,6 +3118,9 @@ equal(const void *a, const void *b)
 		case T_AlterTableCmd:
 			retval = _equalAlterTableCmd(a, b);
 			break;
+		case T_AlterCollationStmt:
+			retval = _equalAlterCollationStmt(a, b);
+			break;
 		case T_AlterDomainStmt:
 			retval = _equalAlterDomainStmt(a, b);
 			break;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index a8e35feccc..b2e071e12f 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -244,7 +244,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 }
 
 %type <node>	stmt schema_stmt
-		AlterEventTrigStmt
+		AlterEventTrigStmt AlterCollationStmt
 		AlterDatabaseStmt AlterDatabaseSetStmt AlterDomainStmt AlterEnumStmt
 		AlterFdwStmt AlterForeignServerStmt AlterGroupStmt
 		AlterObjectDependsStmt AlterObjectSchemaStmt AlterOwnerStmt
@@ -806,6 +806,7 @@ stmtmulti:	stmtmulti ';' stmt
 
 stmt :
 			AlterEventTrigStmt
+			| AlterCollationStmt
 			| AlterDatabaseStmt
 			| AlterDatabaseSetStmt
 			| AlterDefaultPrivilegesStmt
@@ -9712,6 +9713,21 @@ DropdbStmt: DROP DATABASE database_name
 
 /*****************************************************************************
  *
+ *		ALTER COLLATION
+ *
+ *****************************************************************************/
+
+AlterCollationStmt: ALTER COLLATION any_name REFRESH VERSION_P
+				{
+					AlterCollationStmt *n = makeNode(AlterCollationStmt);
+					n->collname = $3;
+					$$ = (Node *)n;
+				}
+		;
+
+
+/*****************************************************************************
+ *
  *		ALTER SYSTEM
  *
  * This is used to change configuration parameters persistently.
diff --git a/src/backend/regex/regc_pg_locale.c b/src/backend/regex/regc_pg_locale.c
index afa3a7d613..d408b58011 100644
--- a/src/backend/regex/regc_pg_locale.c
+++ b/src/backend/regex/regc_pg_locale.c
@@ -68,7 +68,8 @@ typedef enum
 	PG_REGEX_LOCALE_WIDE,		/* Use <wctype.h> functions */
 	PG_REGEX_LOCALE_1BYTE,		/* Use <ctype.h> functions */
 	PG_REGEX_LOCALE_WIDE_L,		/* Use locale_t <wctype.h> functions */
-	PG_REGEX_LOCALE_1BYTE_L		/* Use locale_t <ctype.h> functions */
+	PG_REGEX_LOCALE_1BYTE_L,	/* Use locale_t <ctype.h> functions */
+	PG_REGEX_LOCALE_ICU			/* Use ICU uchar.h functions */
 } PG_Locale_Strategy;
 
 static PG_Locale_Strategy pg_regex_strategy;
@@ -262,6 +263,11 @@ pg_set_regex_collation(Oid collation)
 					 errhint("Use the COLLATE clause to set the collation explicitly.")));
 		}
 
+#ifdef USE_ICU
+		if (pg_regex_locale && pg_regex_locale->provider == COLLPROVIDER_ICU)
+			pg_regex_strategy = PG_REGEX_LOCALE_ICU;
+		else
+#endif
 #ifdef USE_WIDE_UPPER_LOWER
 		if (GetDatabaseEncoding() == PG_UTF8)
 		{
@@ -303,13 +309,18 @@ pg_wc_isdigit(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswdigit_l((wint_t) c, pg_regex_locale);
+				return iswdigit_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isdigit_l((unsigned char) c, pg_regex_locale));
+					isdigit_l((unsigned char) c, pg_regex_locale->lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isdigit(c);
 #endif
 			break;
 	}
@@ -336,13 +347,18 @@ pg_wc_isalpha(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswalpha_l((wint_t) c, pg_regex_locale);
+				return iswalpha_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isalpha_l((unsigned char) c, pg_regex_locale));
+					isalpha_l((unsigned char) c, pg_regex_locale->lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isalpha(c);
 #endif
 			break;
 	}
@@ -369,13 +385,18 @@ pg_wc_isalnum(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswalnum_l((wint_t) c, pg_regex_locale);
+				return iswalnum_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isalnum_l((unsigned char) c, pg_regex_locale));
+					isalnum_l((unsigned char) c, pg_regex_locale->lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isalnum(c);
 #endif
 			break;
 	}
@@ -402,13 +423,18 @@ pg_wc_isupper(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswupper_l((wint_t) c, pg_regex_locale);
+				return iswupper_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isupper_l((unsigned char) c, pg_regex_locale));
+					isupper_l((unsigned char) c, pg_regex_locale->lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isupper(c);
 #endif
 			break;
 	}
@@ -435,13 +461,18 @@ pg_wc_islower(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswlower_l((wint_t) c, pg_regex_locale);
+				return iswlower_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					islower_l((unsigned char) c, pg_regex_locale));
+					islower_l((unsigned char) c, pg_regex_locale->lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_islower(c);
 #endif
 			break;
 	}
@@ -468,13 +499,18 @@ pg_wc_isgraph(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswgraph_l((wint_t) c, pg_regex_locale);
+				return iswgraph_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isgraph_l((unsigned char) c, pg_regex_locale));
+					isgraph_l((unsigned char) c, pg_regex_locale->lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isgraph(c);
 #endif
 			break;
 	}
@@ -501,13 +537,18 @@ pg_wc_isprint(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswprint_l((wint_t) c, pg_regex_locale);
+				return iswprint_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isprint_l((unsigned char) c, pg_regex_locale));
+					isprint_l((unsigned char) c, pg_regex_locale->lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isprint(c);
 #endif
 			break;
 	}
@@ -534,13 +575,18 @@ pg_wc_ispunct(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswpunct_l((wint_t) c, pg_regex_locale);
+				return iswpunct_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					ispunct_l((unsigned char) c, pg_regex_locale));
+					ispunct_l((unsigned char) c, pg_regex_locale->lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_ispunct(c);
 #endif
 			break;
 	}
@@ -567,13 +613,18 @@ pg_wc_isspace(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswspace_l((wint_t) c, pg_regex_locale);
+				return iswspace_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isspace_l((unsigned char) c, pg_regex_locale));
+					isspace_l((unsigned char) c, pg_regex_locale->lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isspace(c);
 #endif
 			break;
 	}
@@ -608,15 +659,20 @@ pg_wc_toupper(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return towupper_l((wint_t) c, pg_regex_locale);
+				return towupper_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			if (c <= (pg_wchar) UCHAR_MAX)
-				return toupper_l((unsigned char) c, pg_regex_locale);
+				return toupper_l((unsigned char) c, pg_regex_locale->lt);
 #endif
 			return c;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_toupper(c);
+#endif
+			break;
 	}
 	return 0;					/* can't get here, but keep compiler quiet */
 }
@@ -649,15 +705,20 @@ pg_wc_tolower(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return towlower_l((wint_t) c, pg_regex_locale);
+				return towlower_l((wint_t) c, pg_regex_locale->lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			if (c <= (pg_wchar) UCHAR_MAX)
-				return tolower_l((unsigned char) c, pg_regex_locale);
+				return tolower_l((unsigned char) c, pg_regex_locale->lt);
 #endif
 			return c;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_tolower(c);
+#endif
+			break;
 	}
 	return 0;					/* can't get here, but keep compiler quiet */
 }
@@ -808,6 +869,9 @@ pg_ctype_get_cache(pg_wc_probefunc probefunc, int cclasscode)
 			max_chr = (pg_wchar) MAX_SIMPLE_CHR;
 #endif
 			break;
+		case PG_REGEX_LOCALE_ICU:
+			max_chr = (pg_wchar) MAX_SIMPLE_CHR;
+			break;
 		default:
 			max_chr = 0;		/* can't get here, but keep compiler quiet */
 			break;
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 0306247177..3c387f53d2 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -1612,6 +1612,10 @@ ProcessUtilitySlow(ParseState *pstate,
 				commandCollected = true;
 				break;
 
+			case T_AlterCollationStmt:
+				address = AlterCollation((AlterCollationStmt *) parsetree);
+				break;
+
 			default:
 				elog(ERROR, "unrecognized node type: %d",
 					 (int) nodeTag(parsetree));
@@ -2665,6 +2669,10 @@ CreateCommandTag(Node *parsetree)
 			tag = "DROP SUBSCRIPTION";
 			break;
 
+		case T_AlterCollationStmt:
+			tag = "ALTER COLLATION";
+			break;
+
 		case T_PrepareStmt:
 			tag = "PREPARE";
 			break;
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index 16a7954ec4..7d52a516eb 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -82,6 +82,10 @@
 #include <wctype.h>
 #endif
 
+#ifdef USE_ICU
+#include <unicode/ustring.h>
+#endif
+
 #include "catalog/pg_collation.h"
 #include "mb/pg_wchar.h"
 #include "utils/builtins.h"
@@ -1443,6 +1447,42 @@ str_numth(char *dest, char *num, int type)
  *			upper/lower/initcap functions
  *****************************************************************************/
 
+#ifdef USE_ICU
+static int32_t
+icu_convert_case(int32_t (*func)(UChar *, int32_t, const UChar *, int32_t, const char *, UErrorCode *),
+				 pg_locale_t mylocale, UChar **buff_dest, UChar *buff_source, int32_t len_source)
+{
+	UErrorCode	status;
+	int32_t		len_dest;
+
+	len_dest = len_source;  /* try first with same length */
+	*buff_dest = palloc(len_dest * sizeof(**buff_dest));
+	status = U_ZERO_ERROR;
+	len_dest = func(*buff_dest, len_dest, buff_source, len_source, mylocale->locale, &status);
+	if (status == U_BUFFER_OVERFLOW_ERROR)
+	{
+		/* try again with adjusted length */
+		pfree(*buff_dest);
+		*buff_dest = palloc(len_dest * sizeof(**buff_dest));
+		status = U_ZERO_ERROR;
+		len_dest = func(*buff_dest, len_dest, buff_source, len_source, mylocale->locale, &status);
+	}
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("case conversion failed: %s", u_errorName(status))));
+	return len_dest;
+}
+
+static int32_t
+u_strToTitle_default_BI(UChar *dest, int32_t destCapacity,
+						const UChar *src, int32_t srcLength,
+						const char *locale,
+						UErrorCode *pErrorCode)
+{
+	return u_strToTitle(dest, destCapacity, src, srcLength, NULL, locale, pErrorCode);
+}
+#endif
+
 /*
  * If the system provides the needed functions for wide-character manipulation
  * (which are all standardized by C99), then we implement upper/lower/initcap
@@ -1479,12 +1519,9 @@ str_tolower(const char *buff, size_t nbytes, Oid collid)
 		result = asc_tolower(buff, nbytes);
 	}
 #ifdef USE_WIDE_UPPER_LOWER
-	else if (pg_database_encoding_max_length() > 1)
+	else
 	{
 		pg_locale_t mylocale = 0;
-		wchar_t    *workspace;
-		size_t		curr_char;
-		size_t		result_size;
 
 		if (collid != DEFAULT_COLLATION_OID)
 		{
@@ -1502,77 +1539,79 @@ str_tolower(const char *buff, size_t nbytes, Oid collid)
 			mylocale = pg_newlocale_from_collation(collid);
 		}
 
-		/* Overflow paranoia */
-		if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
-			ereport(ERROR,
-					(errcode(ERRCODE_OUT_OF_MEMORY),
-					 errmsg("out of memory")));
+#ifdef USE_ICU
+		if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
+		{
+			int32_t		len_uchar;
+			int32_t		len_conv;
+			UChar	   *buff_uchar;
+			UChar	   *buff_conv;
+
+			len_uchar = icu_to_uchar(mylocale, &buff_uchar, buff, nbytes);
+			len_conv = icu_convert_case(u_strToLower, mylocale, &buff_conv, buff_uchar, len_uchar);
+			icu_from_uchar(mylocale, &result, buff_conv, len_conv);
+		}
+		else
+#endif
+		{
+			if (pg_database_encoding_max_length() > 1)
+			{
+				wchar_t    *workspace;
+				size_t		curr_char;
+				size_t		result_size;
 
-		/* Output workspace cannot have more codes than input bytes */
-		workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
+				/* Overflow paranoia */
+				if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
+					ereport(ERROR,
+							(errcode(ERRCODE_OUT_OF_MEMORY),
+							 errmsg("out of memory")));
 
-		char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
+				/* Output workspace cannot have more codes than input bytes */
+				workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
 
-		for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
-		{
+				char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
+
+				for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
+				{
 #ifdef HAVE_LOCALE_T
-			if (mylocale)
-				workspace[curr_char] = towlower_l(workspace[curr_char], mylocale);
-			else
+					if (mylocale)
+						workspace[curr_char] = towlower_l(workspace[curr_char], mylocale->lt);
+					else
 #endif
-				workspace[curr_char] = towlower(workspace[curr_char]);
-		}
+						workspace[curr_char] = towlower(workspace[curr_char]);
+				}
 
-		/* Make result large enough; case change might change number of bytes */
-		result_size = curr_char * pg_database_encoding_max_length() + 1;
-		result = palloc(result_size);
+				/* Make result large enough; case change might change number of bytes */
+				result_size = curr_char * pg_database_encoding_max_length() + 1;
+				result = palloc(result_size);
 
-		wchar2char(result, workspace, result_size, mylocale);
-		pfree(workspace);
-	}
+				wchar2char(result, workspace, result_size, mylocale);
+				pfree(workspace);
+			}
 #endif   /* USE_WIDE_UPPER_LOWER */
-	else
-	{
-#ifdef HAVE_LOCALE_T
-		pg_locale_t mylocale = 0;
-#endif
-		char	   *p;
-
-		if (collid != DEFAULT_COLLATION_OID)
-		{
-			if (!OidIsValid(collid))
+			else
 			{
-				/*
-				 * This typically means that the parser could not resolve a
-				 * conflict of implicit collations, so report it that way.
-				 */
-				ereport(ERROR,
-						(errcode(ERRCODE_INDETERMINATE_COLLATION),
-						 errmsg("could not determine which collation to use for lower() function"),
-						 errhint("Use the COLLATE clause to set the collation explicitly.")));
-			}
-#ifdef HAVE_LOCALE_T
-			mylocale = pg_newlocale_from_collation(collid);
-#endif
-		}
+				char	   *p;
 
-		result = pnstrdup(buff, nbytes);
+				result = pnstrdup(buff, nbytes);
 
-		/*
-		 * Note: we assume that tolower_l() will not be so broken as to need
-		 * an isupper_l() guard test.  When using the default collation, we
-		 * apply the traditional Postgres behavior that forces ASCII-style
-		 * treatment of I/i, but in non-default collations you get exactly
-		 * what the collation says.
-		 */
-		for (p = result; *p; p++)
-		{
+				/*
+				 * Note: we assume that tolower_l() will not be so broken as to need
+				 * an isupper_l() guard test.  When using the default collation, we
+				 * apply the traditional Postgres behavior that forces ASCII-style
+				 * treatment of I/i, but in non-default collations you get exactly
+				 * what the collation says.
+				 */
+				for (p = result; *p; p++)
+				{
 #ifdef HAVE_LOCALE_T
-			if (mylocale)
-				*p = tolower_l((unsigned char) *p, mylocale);
-			else
+					if (mylocale)
+						*p = tolower_l((unsigned char) *p, mylocale->lt);
+					else
 #endif
-				*p = pg_tolower((unsigned char) *p);
+						*p = pg_tolower((unsigned char) *p);
+				}
+			}
 		}
 	}
 
@@ -1599,12 +1638,9 @@ str_toupper(const char *buff, size_t nbytes, Oid collid)
 		result = asc_toupper(buff, nbytes);
 	}
 #ifdef USE_WIDE_UPPER_LOWER
-	else if (pg_database_encoding_max_length() > 1)
+	else
 	{
 		pg_locale_t mylocale = 0;
-		wchar_t    *workspace;
-		size_t		curr_char;
-		size_t		result_size;
 
 		if (collid != DEFAULT_COLLATION_OID)
 		{
@@ -1622,77 +1658,78 @@ str_toupper(const char *buff, size_t nbytes, Oid collid)
 			mylocale = pg_newlocale_from_collation(collid);
 		}
 
-		/* Overflow paranoia */
-		if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
-			ereport(ERROR,
-					(errcode(ERRCODE_OUT_OF_MEMORY),
-					 errmsg("out of memory")));
+#ifdef USE_ICU
+		if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
+		{
+			int32_t		len_uchar, len_conv;
+			UChar	   *buff_uchar;
+			UChar	   *buff_conv;
 
-		/* Output workspace cannot have more codes than input bytes */
-		workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
+			len_uchar = icu_to_uchar(mylocale, &buff_uchar, buff, nbytes);
+			len_conv = icu_convert_case(u_strToUpper, mylocale, &buff_conv, buff_uchar, len_uchar);
+			icu_from_uchar(mylocale, &result, buff_conv, len_conv);
+		}
+		else
+#endif
+		{
+			if (pg_database_encoding_max_length() > 1)
+			{
+				wchar_t    *workspace;
+				size_t		curr_char;
+				size_t		result_size;
 
-		char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
+				/* Overflow paranoia */
+				if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
+					ereport(ERROR,
+							(errcode(ERRCODE_OUT_OF_MEMORY),
+							 errmsg("out of memory")));
 
-		for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
-		{
-#ifdef HAVE_LOCALE_T
-			if (mylocale)
-				workspace[curr_char] = towupper_l(workspace[curr_char], mylocale);
-			else
-#endif
-				workspace[curr_char] = towupper(workspace[curr_char]);
-		}
+				/* Output workspace cannot have more codes than input bytes */
+				workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
 
-		/* Make result large enough; case change might change number of bytes */
-		result_size = curr_char * pg_database_encoding_max_length() + 1;
-		result = palloc(result_size);
+				char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
 
-		wchar2char(result, workspace, result_size, mylocale);
-		pfree(workspace);
-	}
-#endif   /* USE_WIDE_UPPER_LOWER */
-	else
-	{
+				for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
+				{
 #ifdef HAVE_LOCALE_T
-		pg_locale_t mylocale = 0;
+					if (mylocale)
+						workspace[curr_char] = towupper_l(workspace[curr_char], mylocale->lt);
+					else
 #endif
-		char	   *p;
+						workspace[curr_char] = towupper(workspace[curr_char]);
+				}
 
-		if (collid != DEFAULT_COLLATION_OID)
-		{
-			if (!OidIsValid(collid))
-			{
-				/*
-				 * This typically means that the parser could not resolve a
-				 * conflict of implicit collations, so report it that way.
-				 */
-				ereport(ERROR,
-						(errcode(ERRCODE_INDETERMINATE_COLLATION),
-						 errmsg("could not determine which collation to use for upper() function"),
-						 errhint("Use the COLLATE clause to set the collation explicitly.")));
+				/* Make result large enough; case change might change number of bytes */
+				result_size = curr_char * pg_database_encoding_max_length() + 1;
+				result = palloc(result_size);
+
+				wchar2char(result, workspace, result_size, mylocale);
+				pfree(workspace);
 			}
-#ifdef HAVE_LOCALE_T
-			mylocale = pg_newlocale_from_collation(collid);
-#endif
-		}
+#endif   /* USE_WIDE_UPPER_LOWER */
+			else
+			{
+				char	   *p;
 
-		result = pnstrdup(buff, nbytes);
+				result = pnstrdup(buff, nbytes);
 
-		/*
-		 * Note: we assume that toupper_l() will not be so broken as to need
-		 * an islower_l() guard test.  When using the default collation, we
-		 * apply the traditional Postgres behavior that forces ASCII-style
-		 * treatment of I/i, but in non-default collations you get exactly
-		 * what the collation says.
-		 */
-		for (p = result; *p; p++)
-		{
+				/*
+				 * Note: we assume that toupper_l() will not be so broken as to need
+				 * an islower_l() guard test.  When using the default collation, we
+				 * apply the traditional Postgres behavior that forces ASCII-style
+				 * treatment of I/i, but in non-default collations you get exactly
+				 * what the collation says.
+				 */
+				for (p = result; *p; p++)
+				{
 #ifdef HAVE_LOCALE_T
-			if (mylocale)
-				*p = toupper_l((unsigned char) *p, mylocale);
-			else
+					if (mylocale)
+						*p = toupper_l((unsigned char) *p, mylocale->lt);
+					else
 #endif
-				*p = pg_toupper((unsigned char) *p);
+						*p = pg_toupper((unsigned char) *p);
+				}
+			}
 		}
 	}
 
@@ -1720,12 +1757,9 @@ str_initcap(const char *buff, size_t nbytes, Oid collid)
 		result = asc_initcap(buff, nbytes);
 	}
 #ifdef USE_WIDE_UPPER_LOWER
-	else if (pg_database_encoding_max_length() > 1)
+	else
 	{
 		pg_locale_t mylocale = 0;
-		wchar_t    *workspace;
-		size_t		curr_char;
-		size_t		result_size;
 
 		if (collid != DEFAULT_COLLATION_OID)
 		{
@@ -1743,100 +1777,101 @@ str_initcap(const char *buff, size_t nbytes, Oid collid)
 			mylocale = pg_newlocale_from_collation(collid);
 		}
 
-		/* Overflow paranoia */
-		if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
-			ereport(ERROR,
-					(errcode(ERRCODE_OUT_OF_MEMORY),
-					 errmsg("out of memory")));
-
-		/* Output workspace cannot have more codes than input bytes */
-		workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
-
-		char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
-
-		for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
+#ifdef USE_ICU
+		if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
 		{
-#ifdef HAVE_LOCALE_T
-			if (mylocale)
-			{
-				if (wasalnum)
-					workspace[curr_char] = towlower_l(workspace[curr_char], mylocale);
-				else
-					workspace[curr_char] = towupper_l(workspace[curr_char], mylocale);
-				wasalnum = iswalnum_l(workspace[curr_char], mylocale);
-			}
-			else
+			int32_t		len_uchar, len_conv;
+			UChar	   *buff_uchar;
+			UChar	   *buff_conv;
+
+			len_uchar = icu_to_uchar(mylocale, &buff_uchar, buff, nbytes);
+			len_conv = icu_convert_case(u_strToTitle_default_BI, mylocale, &buff_conv, buff_uchar, len_uchar);
+			icu_from_uchar(mylocale, &result, buff_conv, len_conv);
+		}
+		else
 #endif
+		{
+			if (pg_database_encoding_max_length() > 1)
 			{
-				if (wasalnum)
-					workspace[curr_char] = towlower(workspace[curr_char]);
-				else
-					workspace[curr_char] = towupper(workspace[curr_char]);
-				wasalnum = iswalnum(workspace[curr_char]);
-			}
-		}
+				wchar_t    *workspace;
+				size_t		curr_char;
+				size_t		result_size;
 
-		/* Make result large enough; case change might change number of bytes */
-		result_size = curr_char * pg_database_encoding_max_length() + 1;
-		result = palloc(result_size);
+				/* Overflow paranoia */
+				if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
+					ereport(ERROR,
+							(errcode(ERRCODE_OUT_OF_MEMORY),
+							 errmsg("out of memory")));
 
-		wchar2char(result, workspace, result_size, mylocale);
-		pfree(workspace);
-	}
-#endif   /* USE_WIDE_UPPER_LOWER */
-	else
-	{
-#ifdef HAVE_LOCALE_T
-		pg_locale_t mylocale = 0;
-#endif
-		char	   *p;
+				/* Output workspace cannot have more codes than input bytes */
+				workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
 
-		if (collid != DEFAULT_COLLATION_OID)
-		{
-			if (!OidIsValid(collid))
-			{
-				/*
-				 * This typically means that the parser could not resolve a
-				 * conflict of implicit collations, so report it that way.
-				 */
-				ereport(ERROR,
-						(errcode(ERRCODE_INDETERMINATE_COLLATION),
-						 errmsg("could not determine which collation to use for initcap() function"),
-						 errhint("Use the COLLATE clause to set the collation explicitly.")));
-			}
+				char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
+
+				for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
+				{
 #ifdef HAVE_LOCALE_T
-			mylocale = pg_newlocale_from_collation(collid);
+					if (mylocale)
+					{
+						if (wasalnum)
+							workspace[curr_char] = towlower_l(workspace[curr_char], mylocale->lt);
+						else
+							workspace[curr_char] = towupper_l(workspace[curr_char], mylocale->lt);
+						wasalnum = iswalnum_l(workspace[curr_char], mylocale->lt);
+					}
+					else
 #endif
-		}
+					{
+						if (wasalnum)
+							workspace[curr_char] = towlower(workspace[curr_char]);
+						else
+							workspace[curr_char] = towupper(workspace[curr_char]);
+						wasalnum = iswalnum(workspace[curr_char]);
+					}
+				}
 
-		result = pnstrdup(buff, nbytes);
+				/* Make result large enough; case change might change number of bytes */
+				result_size = curr_char * pg_database_encoding_max_length() + 1;
+				result = palloc(result_size);
 
-		/*
-		 * Note: we assume that toupper_l()/tolower_l() will not be so broken
-		 * as to need guard tests.  When using the default collation, we apply
-		 * the traditional Postgres behavior that forces ASCII-style treatment
-		 * of I/i, but in non-default collations you get exactly what the
-		 * collation says.
-		 */
-		for (p = result; *p; p++)
-		{
-#ifdef HAVE_LOCALE_T
-			if (mylocale)
-			{
-				if (wasalnum)
-					*p = tolower_l((unsigned char) *p, mylocale);
-				else
-					*p = toupper_l((unsigned char) *p, mylocale);
-				wasalnum = isalnum_l((unsigned char) *p, mylocale);
+				wchar2char(result, workspace, result_size, mylocale);
+				pfree(workspace);
 			}
+#endif   /* USE_WIDE_UPPER_LOWER */
 			else
-#endif
 			{
-				if (wasalnum)
-					*p = pg_tolower((unsigned char) *p);
-				else
-					*p = pg_toupper((unsigned char) *p);
-				wasalnum = isalnum((unsigned char) *p);
+				char	   *p;
+
+				result = pnstrdup(buff, nbytes);
+
+				/*
+				 * Note: we assume that toupper_l()/tolower_l() will not be so broken
+				 * as to need guard tests.  When using the default collation, we apply
+				 * the traditional Postgres behavior that forces ASCII-style treatment
+				 * of I/i, but in non-default collations you get exactly what the
+				 * collation says.
+				 */
+				for (p = result; *p; p++)
+				{
+#ifdef HAVE_LOCALE_T
+					if (mylocale)
+					{
+						if (wasalnum)
+							*p = tolower_l((unsigned char) *p, mylocale->lt);
+						else
+							*p = toupper_l((unsigned char) *p, mylocale->lt);
+						wasalnum = isalnum_l((unsigned char) *p, mylocale->lt);
+					}
+					else
+#endif
+					{
+						if (wasalnum)
+							*p = pg_tolower((unsigned char) *p);
+						else
+							*p = pg_toupper((unsigned char) *p);
+						wasalnum = isalnum((unsigned char) *p);
+					}
+				}
 			}
 		}
 	}
diff --git a/src/backend/utils/adt/like.c b/src/backend/utils/adt/like.c
index 91fe109867..09ebba6e19 100644
--- a/src/backend/utils/adt/like.c
+++ b/src/backend/utils/adt/like.c
@@ -96,7 +96,7 @@ SB_lower_char(unsigned char c, pg_locale_t locale, bool locale_is_c)
 		return pg_ascii_tolower(c);
 #ifdef HAVE_LOCALE_T
 	else if (locale)
-		return tolower_l(c, locale);
+		return tolower_l(c, locale->lt);
 #endif
 	else
 		return pg_tolower(c);
@@ -165,14 +165,36 @@ Generic_Text_IC_like(text *str, text *pat, Oid collation)
 			   *p;
 	int			slen,
 				plen;
+	pg_locale_t locale = 0;
+	bool		locale_is_c = false;
+
+	if (lc_ctype_is_c(collation))
+		locale_is_c = true;
+	else if (collation != DEFAULT_COLLATION_OID)
+	{
+		if (!OidIsValid(collation))
+		{
+			/*
+			 * This typically means that the parser could not resolve a
+			 * conflict of implicit collations, so report it that way.
+			 */
+			ereport(ERROR,
+					(errcode(ERRCODE_INDETERMINATE_COLLATION),
+					 errmsg("could not determine which collation to use for ILIKE"),
+					 errhint("Use the COLLATE clause to set the collation explicitly.")));
+		}
+		locale = pg_newlocale_from_collation(collation);
+	}
 
 	/*
 	 * For efficiency reasons, in the single byte case we don't call lower()
 	 * on the pattern and text, but instead call SB_lower_char on each
-	 * character.  In the multi-byte case we don't have much choice :-(
+	 * character.  In the multi-byte case we don't have much choice :-(.
+	 * Also, ICU does not support single-character case folding, so we go the
+	 * long way.
 	 */
 
-	if (pg_database_encoding_max_length() > 1)
+	if (pg_database_encoding_max_length() > 1 || (locale && locale->provider == COLLPROVIDER_ICU))
 	{
 		/* lower's result is never packed, so OK to use old macros here */
 		pat = DatumGetTextP(DirectFunctionCall1Coll(lower, collation,
@@ -190,31 +212,6 @@ Generic_Text_IC_like(text *str, text *pat, Oid collation)
 	}
 	else
 	{
-		/*
-		 * Here we need to prepare locale information for SB_lower_char. This
-		 * should match the methods used in str_tolower().
-		 */
-		pg_locale_t locale = 0;
-		bool		locale_is_c = false;
-
-		if (lc_ctype_is_c(collation))
-			locale_is_c = true;
-		else if (collation != DEFAULT_COLLATION_OID)
-		{
-			if (!OidIsValid(collation))
-			{
-				/*
-				 * This typically means that the parser could not resolve a
-				 * conflict of implicit collations, so report it that way.
-				 */
-				ereport(ERROR,
-						(errcode(ERRCODE_INDETERMINATE_COLLATION),
-						 errmsg("could not determine which collation to use for ILIKE"),
-						 errhint("Use the COLLATE clause to set the collation explicitly.")));
-			}
-			locale = pg_newlocale_from_collation(collation);
-		}
-
 		p = VARDATA_ANY(pat);
 		plen = VARSIZE_ANY_EXHDR(pat);
 		s = VARDATA_ANY(str);
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index 4c7f1dad50..8c3fd97d0f 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -58,7 +58,9 @@
 #include "catalog/pg_collation.h"
 #include "catalog/pg_control.h"
 #include "mb/pg_wchar.h"
+#include "utils/builtins.h"
 #include "utils/hsearch.h"
+#include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_locale.h"
 #include "utils/syscache.h"
@@ -1288,45 +1290,105 @@ pg_newlocale_from_collation(Oid collid)
 		collcollate = NameStr(collform->collcollate);
 		collctype = NameStr(collform->collctype);
 
-		if (strcmp(collcollate, collctype) == 0)
+		cache_entry->locale = malloc(sizeof(*cache_entry->locale));
+		memset(cache_entry->locale, 0, sizeof(*cache_entry->locale));
+		cache_entry->locale->provider = collform->collprovider;
+
+		if (collform->collprovider == COLLPROVIDER_LIBC)
 		{
-			/* Normal case where they're the same */
+			if (strcmp(collcollate, collctype) == 0)
+			{
+				/* Normal case where they're the same */
 #ifndef WIN32
-			result = newlocale(LC_COLLATE_MASK | LC_CTYPE_MASK, collcollate,
-							   NULL);
+				result = newlocale(LC_COLLATE_MASK | LC_CTYPE_MASK, collcollate,
+								   NULL);
 #else
-			result = _create_locale(LC_ALL, collcollate);
+				result = _create_locale(LC_ALL, collcollate);
 #endif
-			if (!result)
-				report_newlocale_failure(collcollate);
-		}
-		else
-		{
+				if (!result)
+					report_newlocale_failure(collcollate);
+			}
+			else
+			{
 #ifndef WIN32
-			/* We need two newlocale() steps */
-			locale_t	loc1;
-
-			loc1 = newlocale(LC_COLLATE_MASK, collcollate, NULL);
-			if (!loc1)
-				report_newlocale_failure(collcollate);
-			result = newlocale(LC_CTYPE_MASK, collctype, loc1);
-			if (!result)
-				report_newlocale_failure(collctype);
+				/* We need two newlocale() steps */
+				locale_t	loc1;
+
+				loc1 = newlocale(LC_COLLATE_MASK, collcollate, NULL);
+				if (!loc1)
+					report_newlocale_failure(collcollate);
+				result = newlocale(LC_CTYPE_MASK, collctype, loc1);
+				if (!result)
+					report_newlocale_failure(collctype);
 #else
 
-			/*
-			 * XXX The _create_locale() API doesn't appear to support this.
-			 * Could perhaps be worked around by changing pg_locale_t to
-			 * contain two separate fields.
-			 */
+				/*
+				 * XXX The _create_locale() API doesn't appear to support this.
+				 * Could perhaps be worked around by changing pg_locale_t to
+				 * contain two separate fields.
+				 */
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("collations with different collate and ctype values are not supported on this platform")));
+#endif
+			}
+
+			cache_entry->locale->lt = result;
+		}
+		else if (collform->collprovider == COLLPROVIDER_ICU)
+		{
+#ifdef USE_ICU
+			const char *collcollate;
+			UCollator  *collator;
+			const char *encstring;
+			UConverter *converter;
+			UErrorCode	status;
+			UVersionInfo versioninfo;
+			int32		numversion;
+
+			collcollate = NameStr(collform->collcollate);
+
+			status = U_ZERO_ERROR;
+			collator = ucol_open(collcollate, &status);
+			if (U_FAILURE(status))
+				ereport(ERROR,
+						(errmsg("could not open collator for locale \"%s\": %s",
+								collcollate, u_errorName(status))));
+
+			encstring = (&pg_enc2name_tbl[GetDatabaseEncoding()])->name;
+
+			status = U_ZERO_ERROR;
+			converter = ucnv_open(encstring, &status);
+			if (U_FAILURE(status))
+				ereport(ERROR,
+						(errmsg("could not open converter for encoding \"%s\": %s",
+								encstring, u_errorName(status))));
+
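+			/*
+			 * Check that the collation version recorded when the collation
+			 * was created still matches what the linked ICU library reports,
+			 * and warn if it does not.
+			 */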
+			ucol_getVersion(collator, versioninfo);
+			numversion = ntohl(*((uint32 *) versioninfo));
+
+			if (numversion != collform->collversion)
+				ereport(WARNING,
+						(errmsg("ICU collator version mismatch"),
+						 errdetail("The database was created using version 0x%08X, but the library provides version 0x%08X.",
+								   (uint32) collform->collversion, (uint32) numversion),
+						 errhint("Rebuild all objects affected by this collation and run ALTER COLLATION %s REFRESH VERSION, "
+								 "or build PostgreSQL with the right version of ICU.",
+								 quote_qualified_identifier(get_namespace_name(collform->collnamespace),
+															NameStr(collform->collname)))));
+
+			cache_entry->locale->ucol = collator;
+			cache_entry->locale->locale = strdup(collcollate);
+			cache_entry->locale->ucnv = converter;
+#else /* not USE_ICU */
+			/* could get here if a collation was created by a build with ICU */
 			ereport(ERROR,
 					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-					 errmsg("collations with different collate and ctype values are not supported on this platform")));
-#endif
+					 errmsg("ICU is not supported in this build"),
+					 errhint("You need to rebuild PostgreSQL using --with-icu.")));
+#endif /* not USE_ICU */
 		}
 
-		cache_entry->locale = result;
-
 		ReleaseSysCache(tp);
 #else							/* not HAVE_LOCALE_T */
 
@@ -1344,6 +1406,40 @@ pg_newlocale_from_collation(Oid collid)
 }
 
 
+#ifdef USE_ICU
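+/*
+ * Convert a string in the database encoding into a string of UChars, using
+ * the converter cached in the locale.  The output buffer is palloc'd; the
+ * number of converted UChars is returned.
+ */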
+int32_t
+icu_to_uchar(pg_locale_t mylocale, UChar **buff_uchar, const char *buff, size_t nbytes)
+{
+	UErrorCode	status;
+	int32_t		len_uchar;
+
+	len_uchar = 2 * nbytes;  /* max length per docs */
+	*buff_uchar = palloc(len_uchar * sizeof(**buff_uchar));
+	status = U_ZERO_ERROR;
+	len_uchar = ucnv_toUChars(mylocale->ucnv, *buff_uchar, len_uchar, buff, nbytes, &status);
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("ucnv_toUChars failed: %s", u_errorName(status))));
+	return len_uchar;
+}
+
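+/*
+ * Convert a string of UChars back into the database encoding, using the
+ * converter cached in the locale.  The palloc'd result is returned in
+ * *result.
+ */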
+int32_t
+icu_from_uchar(pg_locale_t mylocale, char **result, UChar *buff_uchar, int32_t len_uchar)
+{
+	UErrorCode	status;
+	int32_t		len_result;
+
+	len_result = UCNV_GET_MAX_BYTES_FOR_STRING(len_uchar, ucnv_getMaxCharSize(mylocale->ucnv));
+	*result = palloc(len_result + 1);
+	status = U_ZERO_ERROR;
+	len_result = ucnv_fromUChars(mylocale->ucnv, *result, len_result, buff_uchar, len_uchar, &status);
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("ucnv_fromUChars failed: %s", u_errorName(status))));
+	return len_result;
+}
+#endif
+
 /*
  * These functions convert from/to libc's wchar_t, *not* pg_wchar_t.
  * Therefore we keep them here rather than with the mbutils code.
@@ -1363,6 +1459,8 @@ wchar2char(char *to, const wchar_t *from, size_t tolen, pg_locale_t locale)
 {
 	size_t		result;
 
+	Assert(!locale || locale->provider == COLLPROVIDER_LIBC);
+
 	if (tolen == 0)
 		return 0;
 
@@ -1399,7 +1497,7 @@ wchar2char(char *to, const wchar_t *from, size_t tolen, pg_locale_t locale)
 #ifdef HAVE_LOCALE_T
 #ifdef HAVE_WCSTOMBS_L
 		/* Use wcstombs_l for nondefault locales */
-		result = wcstombs_l(to, from, tolen, locale);
+		result = wcstombs_l(to, from, tolen, locale->lt);
 #else							/* !HAVE_WCSTOMBS_L */
 		/* We have to temporarily set the locale as current ... ugh */
 		locale_t	save_locale = uselocale(locale);
@@ -1433,6 +1531,8 @@ char2wchar(wchar_t *to, size_t tolen, const char *from, size_t fromlen,
 {
 	size_t		result;
 
+	Assert(!locale || locale->provider == COLLPROVIDER_LIBC);
+
 	if (tolen == 0)
 		return 0;
 
@@ -1474,7 +1574,7 @@ char2wchar(wchar_t *to, size_t tolen, const char *from, size_t fromlen,
 #ifdef HAVE_LOCALE_T
 #ifdef HAVE_MBSTOWCS_L
 			/* Use mbstowcs_l for nondefault locales */
-			result = mbstowcs_l(to, str, tolen, locale);
+			result = mbstowcs_l(to, str, tolen, locale->lt);
 #else							/* !HAVE_MBSTOWCS_L */
 			/* We have to temporarily set the locale as current ... ugh */
 			locale_t	save_locale = uselocale(locale);
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index fa32e9eabe..f803c767d3 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -5275,7 +5275,7 @@ find_join_input_rel(PlannerInfo *root, Relids relids)
 /*
  * Check whether char is a letter (and, hence, subject to case-folding)
  *
- * In multibyte character sets, we can't use isalpha, and it does not seem
+ * In multibyte character sets or with ICU, we can't use isalpha, and it does not seem
  * worth trying to convert to wchar_t to use iswalpha.  Instead, just assume
  * any multibyte char is potentially case-varying.
  */
@@ -5287,9 +5287,11 @@ pattern_char_isalpha(char c, bool is_multibyte,
 		return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
 	else if (is_multibyte && IS_HIGHBIT_SET(c))
 		return true;
+	else if (locale && locale->provider == COLLPROVIDER_ICU)
+		return IS_HIGHBIT_SET(c) ? true : false;
 #ifdef HAVE_LOCALE_T
-	else if (locale)
-		return isalpha_l((unsigned char) c, locale);
+	else if (locale && locale->provider == COLLPROVIDER_LIBC)
+		return isalpha_l((unsigned char) c, locale->lt);
 #endif
 	else
 		return isalpha((unsigned char) c);
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index 254379ade7..0b79f70d3a 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -1403,10 +1403,7 @@ varstr_cmp(char *arg1, int len1, char *arg2, int len2, Oid collid)
 		char		a2buf[TEXTBUFLEN];
 		char	   *a1p,
 				   *a2p;
-
-#ifdef HAVE_LOCALE_T
 		pg_locale_t mylocale = 0;
-#endif
 
 		if (collid != DEFAULT_COLLATION_OID)
 		{
@@ -1421,9 +1418,7 @@ varstr_cmp(char *arg1, int len1, char *arg2, int len2, Oid collid)
 						 errmsg("could not determine which collation to use for string comparison"),
 						 errhint("Use the COLLATE clause to set the collation explicitly.")));
 			}
-#ifdef HAVE_LOCALE_T
 			mylocale = pg_newlocale_from_collation(collid);
-#endif
 		}
 
 		/*
@@ -1542,9 +1537,44 @@ varstr_cmp(char *arg1, int len1, char *arg2, int len2, Oid collid)
 		memcpy(a2p, arg2, len2);
 		a2p[len2] = '\0';
 
-#ifdef HAVE_LOCALE_T
 		if (mylocale)
-			result = strcoll_l(a1p, a2p, mylocale);
+		{
+#ifdef USE_ICU
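+			/*
+			 * If ucol_strcollUTF8() is available and the database encoding is
+			 * UTF-8, compare the strings directly; otherwise convert them to
+			 * UChars first.
+			 */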
+			if (mylocale->provider == COLLPROVIDER_ICU)
+			{
+#ifdef HAVE_UCOL_STRCOLLUTF8
+				if (GetDatabaseEncoding() == PG_UTF8)
+				{
+					UErrorCode	status;
+
+					status = U_ZERO_ERROR;
+					result = ucol_strcollUTF8(mylocale->ucol,
+											  arg1, len1,
+											  arg2, len2,
+											  &status);
+					if (U_FAILURE(status))
+						ereport(ERROR,
+								(errmsg("collation failed: %s", u_errorName(status))));
+				}
+				else
+#endif
+				{
+					int32_t ulen1, ulen2;
+					UChar *uchar1, *uchar2;
+
+					ulen1 = icu_to_uchar(mylocale, &uchar1, arg1, len1);
+					ulen2 = icu_to_uchar(mylocale, &uchar2, arg2, len2);
+
+					result = ucol_strcoll(mylocale->ucol,
+										  uchar1, ulen1,
+										  uchar2, ulen2);
+				}
+			}
+			else
+#endif
+#ifdef HAVE_LOCALE_T
+				result = strcoll_l(a1p, a2p, mylocale->lt);
+		}
 		else
 #endif
 			result = strcoll(a1p, a2p);
@@ -1768,10 +1798,7 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
 	bool		abbreviate = ssup->abbreviate;
 	bool		collate_c = false;
 	VarStringSortSupport *sss;
-
-#ifdef HAVE_LOCALE_T
 	pg_locale_t locale = 0;
-#endif
 
 	/*
 	 * If possible, set ssup->comparator to a function which can be used to
@@ -1826,9 +1853,7 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
 						 errmsg("could not determine which collation to use for string comparison"),
 						 errhint("Use the COLLATE clause to set the collation explicitly.")));
 			}
-#ifdef HAVE_LOCALE_T
 			locale = pg_newlocale_from_collation(collid);
-#endif
 		}
 	}
 
@@ -1854,7 +1879,7 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
 	 * platforms.
 	 */
 #ifndef TRUST_STRXFRM
-	if (!collate_c)
+	if (!collate_c && !(locale && locale->provider == COLLPROVIDER_ICU))
 		abbreviate = false;
 #endif
 
@@ -2090,9 +2115,44 @@ varstrfastcmp_locale(Datum x, Datum y, SortSupport ssup)
 		goto done;
 	}
 
-#ifdef HAVE_LOCALE_T
 	if (sss->locale)
-		result = strcoll_l(sss->buf1, sss->buf2, sss->locale);
+	{
+#ifdef USE_ICU
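+		/* Same ICU-vs-libc dispatch as in varstr_cmp() above */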
+		if (sss->locale->provider == COLLPROVIDER_ICU)
+		{
+#ifdef HAVE_UCOL_STRCOLLUTF8
+			if (GetDatabaseEncoding() == PG_UTF8)
+			{
+				UErrorCode	status;
+
+				status = U_ZERO_ERROR;
+				result = ucol_strcollUTF8(sss->locale->ucol,
+										  a1p, len1,
+										  a2p, len2,
+										  &status);
+				if (U_FAILURE(status))
+					ereport(ERROR,
+							(errmsg("collation failed: %s", u_errorName(status))));
+			}
+			else
+#endif
+			{
+				int32_t ulen1, ulen2;
+				UChar *uchar1, *uchar2;
+
+				ulen1 = icu_to_uchar(sss->locale, &uchar1, a1p, len1);
+				ulen2 = icu_to_uchar(sss->locale, &uchar2, a2p, len2);
+
+				result = ucol_strcoll(sss->locale->ucol,
+									  uchar1, ulen1,
+									  uchar2, ulen2);
+			}
+		}
+		else
+#endif
+#ifdef HAVE_LOCALE_T
+			result = strcoll_l(sss->buf1, sss->buf2, sss->locale->lt);
+	}
 	else
 #endif
 		result = strcoll(sss->buf1, sss->buf2);
@@ -2200,6 +2260,10 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
 	else
 	{
 		Size		bsize;
+#ifdef USE_ICU
+		int32_t		ulen = -1;
+		UChar	   *uchar;
+#endif
 
 		/*
 		 * We're not using the C collation, so fall back on strxfrm.
@@ -2227,12 +2291,22 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
 		sss->buf1[len] = '\0';
 		sss->last_len1 = len;
 
+#ifdef USE_ICU
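+		/*
+		 * With ICU, convert the input to UChars once here; the loop below
+		 * then builds the abbreviated key with ucol_getSortKey().
+		 */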
+		if (sss->locale && sss->locale->provider == COLLPROVIDER_ICU)
+			ulen = icu_to_uchar(sss->locale, &uchar, sss->buf1, len);
+#endif
+
 		for (;;)
 		{
+#ifdef USE_ICU
+			if (sss->locale && sss->locale->provider == COLLPROVIDER_ICU)
+				bsize = ucol_getSortKey(sss->locale->ucol, uchar, ulen, (uint8_t *) sss->buf2, sss->buflen2);
+			else
+#endif
 #ifdef HAVE_LOCALE_T
-			if (sss->locale)
+			if (sss->locale && sss->locale->provider == COLLPROVIDER_LIBC)
 				bsize = strxfrm_l(sss->buf2, sss->buf1,
-								  sss->buflen2, sss->locale);
+								  sss->buflen2, sss->locale->lt);
 			else
 #endif
 				bsize = strxfrm(sss->buf2, sss->buf1, sss->buflen2);
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 443c2ee468..22428417e8 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -62,6 +62,7 @@
 
 #include "catalog/catalog.h"
 #include "catalog/pg_authid.h"
+#include "catalog/pg_collation.h"
 #include "common/file_utils.h"
 #include "common/restricted_token.h"
 #include "common/username.h"
@@ -1618,7 +1619,7 @@ setup_collation(FILE *cmdfd)
 	PG_CMD_PUTS("SELECT pg_import_system_collations(if_not_exists => false, schema => 'pg_catalog');\n\n");
 
 	/* Add an SQL-standard name */
-	PG_CMD_PRINTF2("INSERT INTO pg_collation (collname, collnamespace, collowner, collencoding, collcollate, collctype) VALUES ('ucs_basic', 'pg_catalog'::regnamespace, %u, %d, 'C', 'C');\n\n", BOOTSTRAP_SUPERUSERID, PG_UTF8);
+	PG_CMD_PRINTF3("INSERT INTO pg_collation (collname, collnamespace, collowner, collprovider, collencoding, collcollate, collctype, collversion) VALUES ('ucs_basic', 'pg_catalog'::regnamespace, %u, '%c', %d, 'C', 'C', 0);\n\n", BOOTSTRAP_SUPERUSERID, COLLPROVIDER_LIBC, PG_UTF8);
 }
 
 /*
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index e3cca62bf7..a0941ff172 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -12714,8 +12714,10 @@ dumpCollation(Archive *fout, CollInfo *collinfo)
 	PQExpBuffer delq;
 	PQExpBuffer labelq;
 	PGresult   *res;
+	int			i_collprovider;
 	int			i_collcollate;
 	int			i_collctype;
+	const char *collprovider;
 	const char *collcollate;
 	const char *collctype;
 
@@ -12732,18 +12734,30 @@ dumpCollation(Archive *fout, CollInfo *collinfo)
 	selectSourceSchema(fout, collinfo->dobj.namespace->dobj.name);
 
 	/* Get collation-specific details */
-	appendPQExpBuffer(query, "SELECT "
-					  "collcollate, "
-					  "collctype "
-					  "FROM pg_catalog.pg_collation c "
-					  "WHERE c.oid = '%u'::pg_catalog.oid",
-					  collinfo->dobj.catId.oid);
+	if (fout->remoteVersion >= 100000)
+		appendPQExpBuffer(query, "SELECT "
+						  "collprovider, "
+						  "collcollate, "
+						  "collctype "
+						  "FROM pg_catalog.pg_collation c "
+						  "WHERE c.oid = '%u'::pg_catalog.oid",
+						  collinfo->dobj.catId.oid);
+	else
+		appendPQExpBuffer(query, "SELECT "
+						  "'c'::char AS collprovider, "
+						  "collcollate, "
+						  "collctype "
+						  "FROM pg_catalog.pg_collation c "
+						  "WHERE c.oid = '%u'::pg_catalog.oid",
+						  collinfo->dobj.catId.oid);
 
 	res = ExecuteSqlQueryForSingleRow(fout, query->data);
 
+	i_collprovider = PQfnumber(res, "collprovider");
 	i_collcollate = PQfnumber(res, "collcollate");
 	i_collctype = PQfnumber(res, "collctype");
 
+	collprovider = PQgetvalue(res, 0, i_collprovider);
 	collcollate = PQgetvalue(res, 0, i_collcollate);
 	collctype = PQgetvalue(res, 0, i_collctype);
 
@@ -12755,11 +12769,32 @@ dumpCollation(Archive *fout, CollInfo *collinfo)
 	appendPQExpBuffer(delq, ".%s;\n",
 					  fmtId(collinfo->dobj.name));
 
-	appendPQExpBuffer(q, "CREATE COLLATION %s (lc_collate = ",
+	appendPQExpBuffer(q, "CREATE COLLATION %s (",
 					  fmtId(collinfo->dobj.name));
-	appendStringLiteralAH(q, collcollate, fout);
-	appendPQExpBufferStr(q, ", lc_ctype = ");
-	appendStringLiteralAH(q, collctype, fout);
+
+	appendPQExpBufferStr(q, "provider = ");
+	if (collprovider[0] == 'c')
+		appendStringLiteralAH(q, "libc", fout);
+	else if (collprovider[0] == 'i')
+		appendStringLiteralAH(q, "icu", fout);
+	else
+		exit_horribly(NULL,
+					  "unrecognized collation provider: %s\n",
+					  collprovider);
+
+	if (strcmp(collcollate, collctype) == 0)
+	{
+		appendPQExpBufferStr(q, ", locale = ");
+		appendStringLiteralAH(q, collcollate, fout);
+	}
+	else
+	{
+		appendPQExpBufferStr(q, ", lc_collate = ");
+		appendStringLiteralAH(q, collcollate, fout);
+		appendPQExpBufferStr(q, ", lc_ctype = ");
+		appendStringLiteralAH(q, collctype, fout);
+	}
+
 	appendPQExpBufferStr(q, ");\n");
 
 	appendPQExpBuffer(labelq, "COLLATION %s", fmtId(collinfo->dobj.name));
diff --git a/src/include/catalog/pg_collation.h b/src/include/catalog/pg_collation.h
index 30c87e004e..ccc24fe416 100644
--- a/src/include/catalog/pg_collation.h
+++ b/src/include/catalog/pg_collation.h
@@ -34,9 +34,11 @@ CATALOG(pg_collation,3456)
 	NameData	collname;		/* collation name */
 	Oid			collnamespace;	/* OID of namespace containing collation */
 	Oid			collowner;		/* owner of collation */
+	char		collprovider;	/* see constants below */
 	int32		collencoding;	/* encoding for this collation; -1 = "all" */
 	NameData	collcollate;	/* LC_COLLATE setting */
 	NameData	collctype;		/* LC_CTYPE setting */
+	int32		collversion;	/* provider-dependent version of collation data */
 } FormData_pg_collation;
 
 /* ----------------
@@ -50,27 +52,34 @@ typedef FormData_pg_collation *Form_pg_collation;
  *		compiler constants for pg_collation
  * ----------------
  */
-#define Natts_pg_collation				6
+#define Natts_pg_collation				8
 #define Anum_pg_collation_collname		1
 #define Anum_pg_collation_collnamespace 2
 #define Anum_pg_collation_collowner		3
-#define Anum_pg_collation_collencoding	4
-#define Anum_pg_collation_collcollate	5
-#define Anum_pg_collation_collctype		6
+#define Anum_pg_collation_collprovider	4
+#define Anum_pg_collation_collencoding	5
+#define Anum_pg_collation_collcollate	6
+#define Anum_pg_collation_collctype		7
+#define Anum_pg_collation_collversion	8
 
 /* ----------------
  *		initial contents of pg_collation
  * ----------------
  */
 
-DATA(insert OID = 100 ( default		PGNSP PGUID -1 "" "" ));
+DATA(insert OID = 100 ( default		PGNSP PGUID d -1 "" "" 0 ));
 DESCR("database's default collation");
 #define DEFAULT_COLLATION_OID	100
-DATA(insert OID = 950 ( C			PGNSP PGUID -1 "C" "C" ));
+DATA(insert OID = 950 ( C			PGNSP PGUID c -1 "C" "C" 0 ));
 DESCR("standard C collation");
 #define C_COLLATION_OID			950
-DATA(insert OID = 951 ( POSIX		PGNSP PGUID -1 "POSIX" "POSIX" ));
+DATA(insert OID = 951 ( POSIX		PGNSP PGUID c -1 "POSIX" "POSIX" 0 ));
 DESCR("standard POSIX collation");
 #define POSIX_COLLATION_OID		951
 
+
+#define COLLPROVIDER_DEFAULT	'd'
+#define COLLPROVIDER_ICU		'i'
+#define COLLPROVIDER_LIBC		'c'
+
 #endif   /* PG_COLLATION_H */
diff --git a/src/include/catalog/pg_collation_fn.h b/src/include/catalog/pg_collation_fn.h
index 482ba7920e..a4b2ca5d58 100644
--- a/src/include/catalog/pg_collation_fn.h
+++ b/src/include/catalog/pg_collation_fn.h
@@ -16,8 +16,10 @@
 
 extern Oid CollationCreate(const char *collname, Oid collnamespace,
 				Oid collowner,
+				char collprovider,
 				int32 collencoding,
 				const char *collcollate, const char *collctype,
+				int32 collversion,
 				bool if_not_exists);
 extern void RemoveCollationById(Oid collationOid);
 
diff --git a/src/include/commands/collationcmds.h b/src/include/commands/collationcmds.h
index 699ce2f9ee..64f800c78f 100644
--- a/src/include/commands/collationcmds.h
+++ b/src/include/commands/collationcmds.h
@@ -20,5 +20,6 @@
 
 extern ObjectAddress DefineCollation(ParseState *pstate, List *names, List *parameters);
 extern void IsThereCollationInNamespace(const char *collname, Oid nspOid);
+extern ObjectAddress AlterCollation(AlterCollationStmt *stmt);
 
 #endif   /* COLLATIONCMDS_H */
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index fa4932a902..0831fa7487 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -416,6 +416,7 @@ typedef enum NodeTag
 	T_CreateSubscriptionStmt,
 	T_AlterSubscriptionStmt,
 	T_DropSubscriptionStmt,
+	T_AlterCollationStmt,
 
 	/*
 	 * TAGS FOR PARSE TREE NODES (parsenodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index aad4699f48..3ef2da56ff 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1694,6 +1694,17 @@ typedef struct AlterTableCmd	/* one subcommand of an ALTER TABLE */
 
 
 /* ----------------------
+ * Alter Collation
+ * ----------------------
+ */
+typedef struct AlterCollationStmt
+{
+	NodeTag		type;
+	List	   *collname;
+} AlterCollationStmt;
+
+
+/* ----------------------
  *	Alter Domain
  *
  * The fields are used in different ways by the different variants of
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index b9dfdd41c1..0b48f30379 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -618,6 +618,9 @@
 /* Define to 1 if you have the external array `tzname'. */
 #undef HAVE_TZNAME
 
+/* Define to 1 if you have the `ucol_strcollUTF8' function. */
+#undef HAVE_UCOL_STRCOLLUTF8
+
 /* Define to 1 if you have the <ucred.h> header file. */
 #undef HAVE_UCRED_H
 
@@ -831,6 +834,9 @@
    (--enable-float8-byval) */
 #undef USE_FLOAT8_BYVAL
 
+/* Define to build with ICU support. (--with-icu) */
+#undef USE_ICU
+
 /* Define to 1 if you want 64-bit integer timestamp and interval support.
    (--enable-integer-datetimes) */
 #undef USE_INTEGER_DATETIMES
diff --git a/src/include/utils/pg_locale.h b/src/include/utils/pg_locale.h
index be91f9b88c..e4c774e8ec 100644
--- a/src/include/utils/pg_locale.h
+++ b/src/include/utils/pg_locale.h
@@ -16,6 +16,10 @@
 #if defined(LOCALE_T_IN_XLOCALE) || defined(WCSTOMBS_L_IN_XLOCALE)
 #include <xlocale.h>
 #endif
+#ifdef USE_ICU
+#include <unicode/ucnv.h>
+#include <unicode/ucol.h>
+#endif
 
 #include "utils/guc.h"
 
@@ -62,17 +66,30 @@ extern void cache_locale_time(void);
  * We define our own wrapper around locale_t so we can keep the same
  * function signatures for all builds, while not having to create a
  * fake version of the standard type locale_t in the global namespace.
- * The fake version of pg_locale_t can be checked for truth; that's
- * about all it will be needed for.
+ * pg_locale_t is occasionally checked for truth, so make it a pointer.
  */
+struct pg_locale_t
+{
+	char	provider;	/* see COLLPROVIDER_* in pg_collation.h */
 #ifdef HAVE_LOCALE_T
-typedef locale_t pg_locale_t;
-#else
-typedef int pg_locale_t;
+	locale_t lt;
+#endif
+#ifdef USE_ICU
+	const char *locale;
+	UCollator *ucol;
+	UConverter *ucnv;
 #endif
+};
+
+typedef struct pg_locale_t *pg_locale_t;
 
 extern pg_locale_t pg_newlocale_from_collation(Oid collid);
 
+#ifdef USE_ICU
+extern int32_t icu_to_uchar(pg_locale_t mylocale, UChar **buff_uchar, const char *buff, size_t nbytes);
+extern int32_t icu_from_uchar(pg_locale_t mylocale, char **result, UChar *buff_uchar, int32_t len_uchar);
+#endif
+
 /* These functions convert from/to libc's wchar_t, *not* pg_wchar_t */
 #ifdef USE_WIDE_UPPER_LOWER
 extern size_t wchar2char(char *to, const wchar_t *from, size_t tolen,
diff --git a/src/test/regress/GNUmakefile b/src/test/regress/GNUmakefile
index b923ea1420..0f08f659bf 100644
--- a/src/test/regress/GNUmakefile
+++ b/src/test/regress/GNUmakefile
@@ -125,6 +125,9 @@ tablespace-setup:
 ##
 
 REGRESS_OPTS = --dlpath=. $(EXTRA_REGRESS_OPTS)
+ifeq ($(with_icu),yes)
+EXTRA_TESTS += collate.icu
+endif
 
 check: all tablespace-setup
 	$(pg_regress_check) $(REGRESS_OPTS) --schedule=$(srcdir)/parallel_schedule $(MAXCONNOPT) $(EXTRA_TESTS)
diff --git a/src/test/regress/expected/collate.icu.out b/src/test/regress/expected/collate.icu.out
new file mode 100644
index 0000000000..6787a9638a
--- /dev/null
+++ b/src/test/regress/expected/collate.icu.out
@@ -0,0 +1,1107 @@
+/*
+ * This test is for ICU collations.
+ */
+SET client_encoding TO UTF8;
+CREATE TABLE collate_test1 (
+    a int,
+    b text COLLATE "en_US%icu" NOT NULL
+);
+\d collate_test1
+           Table "public.collate_test1"
+ Column |  Type   | Collation | Nullable | Default 
+--------+---------+-----------+----------+---------
+ a      | integer |           |          | 
+ b      | text    | en_US%icu | not null | 
+
+CREATE TABLE collate_test_fail (
+    a int,
+    b text COLLATE "ja_JP.eucjp%icu"
+);
+ERROR:  collation "ja_JP.eucjp%icu" for encoding "UTF8" does not exist
+LINE 3:     b text COLLATE "ja_JP.eucjp%icu"
+                   ^
+CREATE TABLE collate_test_fail (
+    a int,
+    b text COLLATE "foo%icu"
+);
+ERROR:  collation "foo%icu" for encoding "UTF8" does not exist
+LINE 3:     b text COLLATE "foo%icu"
+                   ^
+CREATE TABLE collate_test_fail (
+    a int COLLATE "en_US%icu",
+    b text
+);
+ERROR:  collations are not supported by type integer
+LINE 2:     a int COLLATE "en_US%icu",
+                  ^
+CREATE TABLE collate_test_like (
+    LIKE collate_test1
+);
+\d collate_test_like
+         Table "public.collate_test_like"
+ Column |  Type   | Collation | Nullable | Default 
+--------+---------+-----------+----------+---------
+ a      | integer |           |          | 
+ b      | text    | en_US%icu | not null | 
+
+CREATE TABLE collate_test2 (
+    a int,
+    b text COLLATE "sv_SE%icu"
+);
+CREATE TABLE collate_test3 (
+    a int,
+    b text COLLATE "C"
+);
+INSERT INTO collate_test1 VALUES (1, 'abc'), (2, 'äbc'), (3, 'bbc'), (4, 'ABC');
+INSERT INTO collate_test2 SELECT * FROM collate_test1;
+INSERT INTO collate_test3 SELECT * FROM collate_test1;
+SELECT * FROM collate_test1 WHERE b >= 'bbc';
+ a |  b  
+---+-----
+ 3 | bbc
+(1 row)
+
+SELECT * FROM collate_test2 WHERE b >= 'bbc';
+ a |  b  
+---+-----
+ 2 | äbc
+ 3 | bbc
+(2 rows)
+
+SELECT * FROM collate_test3 WHERE b >= 'bbc';
+ a |  b  
+---+-----
+ 2 | äbc
+ 3 | bbc
+(2 rows)
+
+SELECT * FROM collate_test3 WHERE b >= 'BBC';
+ a |  b  
+---+-----
+ 1 | abc
+ 2 | äbc
+ 3 | bbc
+(3 rows)
+
+SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc';
+ a |  b  
+---+-----
+ 2 | äbc
+ 3 | bbc
+(2 rows)
+
+SELECT * FROM collate_test1 WHERE b >= 'bbc' COLLATE "C";
+ a |  b  
+---+-----
+ 2 | äbc
+ 3 | bbc
+(2 rows)
+
+SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "C";
+ a |  b  
+---+-----
+ 2 | äbc
+ 3 | bbc
+(2 rows)
+
+SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "en_US%icu";
+ERROR:  collation mismatch between explicit collations "C" and "en_US%icu"
+LINE 1: ...* FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "e...
+                                                             ^
+CREATE DOMAIN testdomain_sv AS text COLLATE "sv_SE%icu";
+CREATE DOMAIN testdomain_i AS int COLLATE "sv_SE%icu"; -- fails
+ERROR:  collations are not supported by type integer
+CREATE TABLE collate_test4 (
+    a int,
+    b testdomain_sv
+);
+INSERT INTO collate_test4 SELECT * FROM collate_test1;
+SELECT a, b FROM collate_test4 ORDER BY b;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+CREATE TABLE collate_test5 (
+    a int,
+    b testdomain_sv COLLATE "en_US%icu"
+);
+INSERT INTO collate_test5 SELECT * FROM collate_test1;
+SELECT a, b FROM collate_test5 ORDER BY b;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 2 | äbc
+ 3 | bbc
+(4 rows)
+
+SELECT a, b FROM collate_test1 ORDER BY b;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 2 | äbc
+ 3 | bbc
+(4 rows)
+
+SELECT a, b FROM collate_test2 ORDER BY b;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, b FROM collate_test3 ORDER BY b;
+ a |  b  
+---+-----
+ 4 | ABC
+ 1 | abc
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, b FROM collate_test1 ORDER BY b COLLATE "C";
+ a |  b  
+---+-----
+ 4 | ABC
+ 1 | abc
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+-- star expansion
+SELECT * FROM collate_test1 ORDER BY b;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 2 | äbc
+ 3 | bbc
+(4 rows)
+
+SELECT * FROM collate_test2 ORDER BY b;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT * FROM collate_test3 ORDER BY b;
+ a |  b  
+---+-----
+ 4 | ABC
+ 1 | abc
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+-- constant expression folding
+SELECT 'bbc' COLLATE "en_US%icu" > 'äbc' COLLATE "en_US%icu" AS "true";
+ true 
+------
+ t
+(1 row)
+
+SELECT 'bbc' COLLATE "sv_SE%icu" > 'äbc' COLLATE "sv_SE%icu" AS "false";
+ false 
+-------
+ f
+(1 row)
+
+-- upper/lower
+CREATE TABLE collate_test10 (
+    a int,
+    x text COLLATE "en_US%icu",
+    y text COLLATE "tr_TR%icu"
+);
+INSERT INTO collate_test10 VALUES (1, 'hij', 'hij'), (2, 'HIJ', 'HIJ');
+SELECT a, lower(x), lower(y), upper(x), upper(y), initcap(x), initcap(y) FROM collate_test10;
+ a | lower | lower | upper | upper | initcap | initcap 
+---+-------+-------+-------+-------+---------+---------
+ 1 | hij   | hij   | HIJ   | HİJ   | Hij     | Hij
+ 2 | hij   | hıj   | HIJ   | HIJ   | Hij     | Hıj
+(2 rows)
+
+SELECT a, lower(x COLLATE "C"), lower(y COLLATE "C") FROM collate_test10;
+ a | lower | lower 
+---+-------+-------
+ 1 | hij   | hij
+ 2 | hij   | hij
+(2 rows)
+
+SELECT a, x, y FROM collate_test10 ORDER BY lower(y), a;
+ a |  x  |  y  
+---+-----+-----
+ 2 | HIJ | HIJ
+ 1 | hij | hij
+(2 rows)
+
+-- LIKE/ILIKE
+SELECT * FROM collate_test1 WHERE b LIKE 'abc';
+ a |  b  
+---+-----
+ 1 | abc
+(1 row)
+
+SELECT * FROM collate_test1 WHERE b LIKE 'abc%';
+ a |  b  
+---+-----
+ 1 | abc
+(1 row)
+
+SELECT * FROM collate_test1 WHERE b LIKE '%bc%';
+ a |  b  
+---+-----
+ 1 | abc
+ 2 | äbc
+ 3 | bbc
+(3 rows)
+
+SELECT * FROM collate_test1 WHERE b ILIKE 'abc';
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+(2 rows)
+
+SELECT * FROM collate_test1 WHERE b ILIKE 'abc%';
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+(2 rows)
+
+SELECT * FROM collate_test1 WHERE b ILIKE '%bc%';
+ a |  b  
+---+-----
+ 1 | abc
+ 2 | äbc
+ 3 | bbc
+ 4 | ABC
+(4 rows)
+
+SELECT 'Türkiye' COLLATE "en_US%icu" ILIKE '%KI%' AS "true";
+ true 
+------
+ t
+(1 row)
+
+SELECT 'Türkiye' COLLATE "tr_TR%icu" ILIKE '%KI%' AS "false";
+ false 
+-------
+ f
+(1 row)
+
+SELECT 'bıt' ILIKE 'BIT' COLLATE "en_US%icu" AS "false";
+ false 
+-------
+ f
+(1 row)
+
+SELECT 'bıt' ILIKE 'BIT' COLLATE "tr_TR%icu" AS "true";
+ true 
+------
+ t
+(1 row)
+
+-- The following actually exercises the selectivity estimation for ILIKE.
+SELECT relname FROM pg_class WHERE relname ILIKE 'abc%';
+ relname 
+---------
+(0 rows)
+
+-- regular expressions
+SELECT * FROM collate_test1 WHERE b ~ '^abc$';
+ a |  b  
+---+-----
+ 1 | abc
+(1 row)
+
+SELECT * FROM collate_test1 WHERE b ~ '^abc';
+ a |  b  
+---+-----
+ 1 | abc
+(1 row)
+
+SELECT * FROM collate_test1 WHERE b ~ 'bc';
+ a |  b  
+---+-----
+ 1 | abc
+ 2 | äbc
+ 3 | bbc
+(3 rows)
+
+SELECT * FROM collate_test1 WHERE b ~* '^abc$';
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+(2 rows)
+
+SELECT * FROM collate_test1 WHERE b ~* '^abc';
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+(2 rows)
+
+SELECT * FROM collate_test1 WHERE b ~* 'bc';
+ a |  b  
+---+-----
+ 1 | abc
+ 2 | äbc
+ 3 | bbc
+ 4 | ABC
+(4 rows)
+
+CREATE TABLE collate_test6 (
+    a int,
+    b text COLLATE "en_US%icu"
+);
+INSERT INTO collate_test6 VALUES (1, 'abc'), (2, 'ABC'), (3, '123'), (4, 'ab1'),
+                                 (5, 'a1!'), (6, 'a c'), (7, '!.;'), (8, '   '),
+                                 (9, 'äbç'), (10, 'ÄBÇ');
+SELECT b,
+       b ~ '^[[:alpha:]]+$' AS is_alpha,
+       b ~ '^[[:upper:]]+$' AS is_upper,
+       b ~ '^[[:lower:]]+$' AS is_lower,
+       b ~ '^[[:digit:]]+$' AS is_digit,
+       b ~ '^[[:alnum:]]+$' AS is_alnum,
+       b ~ '^[[:graph:]]+$' AS is_graph,
+       b ~ '^[[:print:]]+$' AS is_print,
+       b ~ '^[[:punct:]]+$' AS is_punct,
+       b ~ '^[[:space:]]+$' AS is_space
+FROM collate_test6;
+  b  | is_alpha | is_upper | is_lower | is_digit | is_alnum | is_graph | is_print | is_punct | is_space 
+-----+----------+----------+----------+----------+----------+----------+----------+----------+----------
+ abc | t        | f        | t        | f        | t        | t        | t        | f        | f
+ ABC | t        | t        | f        | f        | t        | t        | t        | f        | f
+ 123 | f        | f        | f        | t        | t        | t        | t        | f        | f
+ ab1 | f        | f        | f        | f        | t        | t        | t        | f        | f
+ a1! | f        | f        | f        | f        | f        | t        | t        | f        | f
+ a c | f        | f        | f        | f        | f        | f        | t        | f        | f
+ !.; | f        | f        | f        | f        | f        | t        | t        | t        | f
+     | f        | f        | f        | f        | f        | f        | t        | f        | t
+ äbç | t        | f        | t        | f        | t        | t        | t        | f        | f
+ ÄBÇ | t        | t        | f        | f        | t        | t        | t        | f        | f
+(10 rows)
+
+SELECT 'Türkiye' COLLATE "en_US%icu" ~* 'KI' AS "true";
+ true 
+------
+ t
+(1 row)
+
+SELECT 'Türkiye' COLLATE "tr_TR%icu" ~* 'KI' AS "true";  -- true with ICU
+ true 
+------
+ t
+(1 row)
+
+SELECT 'bıt' ~* 'BIT' COLLATE "en_US%icu" AS "false";
+ false 
+-------
+ f
+(1 row)
+
+SELECT 'bıt' ~* 'BIT' COLLATE "tr_TR%icu" AS "false";  -- false with ICU
+ false 
+-------
+ f
+(1 row)
+
+-- The following actually exercises the selectivity estimation for ~*.
+SELECT relname FROM pg_class WHERE relname ~* '^abc';
+ relname 
+---------
+(0 rows)
+
+-- to_char
+SET lc_time TO 'tr_TR';
+SELECT to_char(date '2010-04-01', 'DD TMMON YYYY');
+   to_char   
+-------------
+ 01 NIS 2010
+(1 row)
+
+SELECT to_char(date '2010-04-01', 'DD TMMON YYYY' COLLATE "tr_TR%icu");
+   to_char   
+-------------
+ 01 NİS 2010
+(1 row)
+
+-- backwards parsing
+CREATE VIEW collview1 AS SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc';
+CREATE VIEW collview2 AS SELECT a, b FROM collate_test1 ORDER BY b COLLATE "C";
+CREATE VIEW collview3 AS SELECT a, lower((x || x) COLLATE "C") FROM collate_test10;
+SELECT table_name, view_definition FROM information_schema.views
+  WHERE table_name LIKE 'collview%' ORDER BY 1;
+ table_name |                             view_definition                              
+------------+--------------------------------------------------------------------------
+ collview1  |  SELECT collate_test1.a,                                                +
+            |     collate_test1.b                                                     +
+            |    FROM collate_test1                                                   +
+            |   WHERE ((collate_test1.b COLLATE "C") >= 'bbc'::text);
+ collview2  |  SELECT collate_test1.a,                                                +
+            |     collate_test1.b                                                     +
+            |    FROM collate_test1                                                   +
+            |   ORDER BY (collate_test1.b COLLATE "C");
+ collview3  |  SELECT collate_test10.a,                                               +
+            |     lower(((collate_test10.x || collate_test10.x) COLLATE "C")) AS lower+
+            |    FROM collate_test10;
+(3 rows)
+
+-- collation propagation in various expression types
+SELECT a, coalesce(b, 'foo') FROM collate_test1 ORDER BY 2;
+ a | coalesce 
+---+----------
+ 1 | abc
+ 4 | ABC
+ 2 | äbc
+ 3 | bbc
+(4 rows)
+
+SELECT a, coalesce(b, 'foo') FROM collate_test2 ORDER BY 2;
+ a | coalesce 
+---+----------
+ 1 | abc
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, coalesce(b, 'foo') FROM collate_test3 ORDER BY 2;
+ a | coalesce 
+---+----------
+ 4 | ABC
+ 1 | abc
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, lower(coalesce(x, 'foo')), lower(coalesce(y, 'foo')) FROM collate_test10;
+ a | lower | lower 
+---+-------+-------
+ 1 | hij   | hij
+ 2 | hij   | hıj
+(2 rows)
+
+SELECT a, b, greatest(b, 'CCC') FROM collate_test1 ORDER BY 3;
+ a |  b  | greatest 
+---+-----+----------
+ 1 | abc | CCC
+ 2 | äbc | CCC
+ 3 | bbc | CCC
+ 4 | ABC | CCC
+(4 rows)
+
+SELECT a, b, greatest(b, 'CCC') FROM collate_test2 ORDER BY 3;
+ a |  b  | greatest 
+---+-----+----------
+ 1 | abc | CCC
+ 3 | bbc | CCC
+ 4 | ABC | CCC
+ 2 | äbc | äbc
+(4 rows)
+
+SELECT a, b, greatest(b, 'CCC') FROM collate_test3 ORDER BY 3;
+ a |  b  | greatest 
+---+-----+----------
+ 4 | ABC | CCC
+ 1 | abc | abc
+ 3 | bbc | bbc
+ 2 | äbc | äbc
+(4 rows)
+
+SELECT a, x, y, lower(greatest(x, 'foo')), lower(greatest(y, 'foo')) FROM collate_test10;
+ a |  x  |  y  | lower | lower 
+---+-----+-----+-------+-------
+ 1 | hij | hij | hij   | hij
+ 2 | HIJ | HIJ | hij   | hıj
+(2 rows)
+
+SELECT a, nullif(b, 'abc') FROM collate_test1 ORDER BY 2;
+ a | nullif 
+---+--------
+ 4 | ABC
+ 2 | äbc
+ 3 | bbc
+ 1 | 
+(4 rows)
+
+SELECT a, nullif(b, 'abc') FROM collate_test2 ORDER BY 2;
+ a | nullif 
+---+--------
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+ 1 | 
+(4 rows)
+
+SELECT a, nullif(b, 'abc') FROM collate_test3 ORDER BY 2;
+ a | nullif 
+---+--------
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+ 1 | 
+(4 rows)
+
+SELECT a, lower(nullif(x, 'foo')), lower(nullif(y, 'foo')) FROM collate_test10;
+ a | lower | lower 
+---+-------+-------
+ 1 | hij   | hij
+ 2 | hij   | hıj
+(2 rows)
+
+SELECT a, CASE b WHEN 'abc' THEN 'abcd' ELSE b END FROM collate_test1 ORDER BY 2;
+ a |  b   
+---+------
+ 4 | ABC
+ 2 | äbc
+ 1 | abcd
+ 3 | bbc
+(4 rows)
+
+SELECT a, CASE b WHEN 'abc' THEN 'abcd' ELSE b END FROM collate_test2 ORDER BY 2;
+ a |  b   
+---+------
+ 4 | ABC
+ 1 | abcd
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, CASE b WHEN 'abc' THEN 'abcd' ELSE b END FROM collate_test3 ORDER BY 2;
+ a |  b   
+---+------
+ 4 | ABC
+ 1 | abcd
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+CREATE DOMAIN testdomain AS text;
+SELECT a, b::testdomain FROM collate_test1 ORDER BY 2;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 2 | äbc
+ 3 | bbc
+(4 rows)
+
+SELECT a, b::testdomain FROM collate_test2 ORDER BY 2;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, b::testdomain FROM collate_test3 ORDER BY 2;
+ a |  b  
+---+-----
+ 4 | ABC
+ 1 | abc
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, b::testdomain_sv FROM collate_test3 ORDER BY 2;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, lower(x::testdomain), lower(y::testdomain) FROM collate_test10;
+ a | lower | lower 
+---+-------+-------
+ 1 | hij   | hij
+ 2 | hij   | hıj
+(2 rows)
+
+SELECT min(b), max(b) FROM collate_test1;
+ min | max 
+-----+-----
+ abc | bbc
+(1 row)
+
+SELECT min(b), max(b) FROM collate_test2;
+ min | max 
+-----+-----
+ abc | äbc
+(1 row)
+
+SELECT min(b), max(b) FROM collate_test3;
+ min | max 
+-----+-----
+ ABC | äbc
+(1 row)
+
+SELECT array_agg(b ORDER BY b) FROM collate_test1;
+     array_agg     
+-------------------
+ {abc,ABC,äbc,bbc}
+(1 row)
+
+SELECT array_agg(b ORDER BY b) FROM collate_test2;
+     array_agg     
+-------------------
+ {abc,ABC,bbc,äbc}
+(1 row)
+
+SELECT array_agg(b ORDER BY b) FROM collate_test3;
+     array_agg     
+-------------------
+ {ABC,abc,bbc,äbc}
+(1 row)
+
+SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM collate_test1 ORDER BY 2;
+ a |  b  
+---+-----
+ 1 | abc
+ 1 | abc
+ 4 | ABC
+ 4 | ABC
+ 2 | äbc
+ 2 | äbc
+ 3 | bbc
+ 3 | bbc
+(8 rows)
+
+SELECT a, b FROM collate_test2 UNION SELECT a, b FROM collate_test2 ORDER BY 2;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, b FROM collate_test3 WHERE a < 4 INTERSECT SELECT a, b FROM collate_test3 WHERE a > 1 ORDER BY 2;
+ a |  b  
+---+-----
+ 3 | bbc
+ 2 | äbc
+(2 rows)
+
+SELECT a, b FROM collate_test3 EXCEPT SELECT a, b FROM collate_test3 WHERE a < 2 ORDER BY 2;
+ a |  b  
+---+-----
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(3 rows)
+
+SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
+ERROR:  could not determine which collation to use for string comparison
+HINT:  Use the COLLATE clause to set the collation explicitly.
+SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM collate_test3; -- ok
+ a |  b  
+---+-----
+ 1 | abc
+ 2 | äbc
+ 3 | bbc
+ 4 | ABC
+ 1 | abc
+ 2 | äbc
+ 3 | bbc
+ 4 | ABC
+(8 rows)
+
+SELECT a, b FROM collate_test1 UNION SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
+ERROR:  collation mismatch between implicit collations "en_US%icu" and "C"
+LINE 1: SELECT a, b FROM collate_test1 UNION SELECT a, b FROM collat...
+                                                       ^
+HINT:  You can choose the collation by applying the COLLATE clause to one or both expressions.
+SELECT a, b COLLATE "C" FROM collate_test1 UNION SELECT a, b FROM collate_test3 ORDER BY 2; -- ok
+ a |  b  
+---+-----
+ 4 | ABC
+ 1 | abc
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, b FROM collate_test1 INTERSECT SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
+ERROR:  collation mismatch between implicit collations "en_US%icu" and "C"
+LINE 1: ...ELECT a, b FROM collate_test1 INTERSECT SELECT a, b FROM col...
+                                                             ^
+HINT:  You can choose the collation by applying the COLLATE clause to one or both expressions.
+SELECT a, b FROM collate_test1 EXCEPT SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
+ERROR:  collation mismatch between implicit collations "en_US%icu" and "C"
+LINE 1: SELECT a, b FROM collate_test1 EXCEPT SELECT a, b FROM colla...
+                                                        ^
+HINT:  You can choose the collation by applying the COLLATE clause to one or both expressions.
+CREATE TABLE test_u AS SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM collate_test3; -- fail
+ERROR:  no collation was derived for column "b" with collatable type text
+HINT:  Use the COLLATE clause to set the collation explicitly.
+-- ideally this would be a parse-time error, but for now it must be run-time:
+select x < y from collate_test10; -- fail
+ERROR:  could not determine which collation to use for string comparison
+HINT:  Use the COLLATE clause to set the collation explicitly.
+select x || y from collate_test10; -- ok, because || is not collation aware
+ ?column? 
+----------
+ hijhij
+ HIJHIJ
+(2 rows)
+
+select x, y from collate_test10 order by x || y; -- not so ok
+ERROR:  collation mismatch between implicit collations "en_US%icu" and "tr_TR%icu"
+LINE 1: select x, y from collate_test10 order by x || y;
+                                                      ^
+HINT:  You can choose the collation by applying the COLLATE clause to one or both expressions.
+-- collation mismatch between recursive and non-recursive term
+WITH RECURSIVE foo(x) AS
+   (SELECT x FROM (VALUES('a' COLLATE "en_US%icu"),('b')) t(x)
+   UNION ALL
+   SELECT (x || 'c') COLLATE "de_DE%icu" FROM foo WHERE length(x) < 10)
+SELECT * FROM foo;
+ERROR:  recursive query "foo" column 1 has collation "en_US%icu" in non-recursive term but collation "de_DE%icu" overall
+LINE 2:    (SELECT x FROM (VALUES('a' COLLATE "en_US%icu"),('b')) t(...
+                   ^
+HINT:  Use the COLLATE clause to set the collation of the non-recursive term.
+-- casting
+SELECT CAST('42' AS text COLLATE "C");
+ERROR:  syntax error at or near "COLLATE"
+LINE 1: SELECT CAST('42' AS text COLLATE "C");
+                                 ^
+SELECT a, CAST(b AS varchar) FROM collate_test1 ORDER BY 2;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 2 | äbc
+ 3 | bbc
+(4 rows)
+
+SELECT a, CAST(b AS varchar) FROM collate_test2 ORDER BY 2;
+ a |  b  
+---+-----
+ 1 | abc
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, CAST(b AS varchar) FROM collate_test3 ORDER BY 2;
+ a |  b  
+---+-----
+ 4 | ABC
+ 1 | abc
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+-- propagation of collation in SQL functions (inlined and non-inlined cases)
+-- and plpgsql functions too
+CREATE FUNCTION mylt (text, text) RETURNS boolean LANGUAGE sql
+    AS $$ select $1 < $2 $$;
+CREATE FUNCTION mylt_noninline (text, text) RETURNS boolean LANGUAGE sql
+    AS $$ select $1 < $2 limit 1 $$;
+CREATE FUNCTION mylt_plpgsql (text, text) RETURNS boolean LANGUAGE plpgsql
+    AS $$ begin return $1 < $2; end $$;
+SELECT a.b AS a, b.b AS b, a.b < b.b AS lt,
+       mylt(a.b, b.b), mylt_noninline(a.b, b.b), mylt_plpgsql(a.b, b.b)
+FROM collate_test1 a, collate_test1 b
+ORDER BY a.b, b.b;
+  a  |  b  | lt | mylt | mylt_noninline | mylt_plpgsql 
+-----+-----+----+------+----------------+--------------
+ abc | abc | f  | f    | f              | f
+ abc | ABC | t  | t    | t              | t
+ abc | äbc | t  | t    | t              | t
+ abc | bbc | t  | t    | t              | t
+ ABC | abc | f  | f    | f              | f
+ ABC | ABC | f  | f    | f              | f
+ ABC | äbc | t  | t    | t              | t
+ ABC | bbc | t  | t    | t              | t
+ äbc | abc | f  | f    | f              | f
+ äbc | ABC | f  | f    | f              | f
+ äbc | äbc | f  | f    | f              | f
+ äbc | bbc | t  | t    | t              | t
+ bbc | abc | f  | f    | f              | f
+ bbc | ABC | f  | f    | f              | f
+ bbc | äbc | f  | f    | f              | f
+ bbc | bbc | f  | f    | f              | f
+(16 rows)
+
+SELECT a.b AS a, b.b AS b, a.b < b.b COLLATE "C" AS lt,
+       mylt(a.b, b.b COLLATE "C"), mylt_noninline(a.b, b.b COLLATE "C"),
+       mylt_plpgsql(a.b, b.b COLLATE "C")
+FROM collate_test1 a, collate_test1 b
+ORDER BY a.b, b.b;
+  a  |  b  | lt | mylt | mylt_noninline | mylt_plpgsql 
+-----+-----+----+------+----------------+--------------
+ abc | abc | f  | f    | f              | f
+ abc | ABC | f  | f    | f              | f
+ abc | äbc | t  | t    | t              | t
+ abc | bbc | t  | t    | t              | t
+ ABC | abc | t  | t    | t              | t
+ ABC | ABC | f  | f    | f              | f
+ ABC | äbc | t  | t    | t              | t
+ ABC | bbc | t  | t    | t              | t
+ äbc | abc | f  | f    | f              | f
+ äbc | ABC | f  | f    | f              | f
+ äbc | äbc | f  | f    | f              | f
+ äbc | bbc | f  | f    | f              | f
+ bbc | abc | f  | f    | f              | f
+ bbc | ABC | f  | f    | f              | f
+ bbc | äbc | t  | t    | t              | t
+ bbc | bbc | f  | f    | f              | f
+(16 rows)
+
+-- collation override in plpgsql
+CREATE FUNCTION mylt2 (x text, y text) RETURNS boolean LANGUAGE plpgsql AS $$
+declare
+  xx text := x;
+  yy text := y;
+begin
+  return xx < yy;
+end
+$$;
+SELECT mylt2('a', 'B' collate "en_US%icu") as t, mylt2('a', 'B' collate "C") as f;
+ t | f 
+---+---
+ t | f
+(1 row)
+
+CREATE OR REPLACE FUNCTION
+  mylt2 (x text, y text) RETURNS boolean LANGUAGE plpgsql AS $$
+declare
+  xx text COLLATE "POSIX" := x;
+  yy text := y;
+begin
+  return xx < yy;
+end
+$$;
+SELECT mylt2('a', 'B') as f;
+ f 
+---
+ f
+(1 row)
+
+SELECT mylt2('a', 'B' collate "C") as fail; -- conflicting collations
+ERROR:  could not determine which collation to use for string comparison
+HINT:  Use the COLLATE clause to set the collation explicitly.
+CONTEXT:  PL/pgSQL function mylt2(text,text) line 6 at RETURN
+SELECT mylt2('a', 'B' collate "POSIX") as f;
+ f 
+---
+ f
+(1 row)
+
+-- polymorphism
+SELECT * FROM unnest((SELECT array_agg(b ORDER BY b) FROM collate_test1)) ORDER BY 1;
+ unnest 
+--------
+ abc
+ ABC
+ äbc
+ bbc
+(4 rows)
+
+SELECT * FROM unnest((SELECT array_agg(b ORDER BY b) FROM collate_test2)) ORDER BY 1;
+ unnest 
+--------
+ abc
+ ABC
+ bbc
+ äbc
+(4 rows)
+
+SELECT * FROM unnest((SELECT array_agg(b ORDER BY b) FROM collate_test3)) ORDER BY 1;
+ unnest 
+--------
+ ABC
+ abc
+ bbc
+ äbc
+(4 rows)
+
+CREATE FUNCTION dup (anyelement) RETURNS anyelement
+    AS 'select $1' LANGUAGE sql;
+SELECT a, dup(b) FROM collate_test1 ORDER BY 2;
+ a | dup 
+---+-----
+ 1 | abc
+ 4 | ABC
+ 2 | äbc
+ 3 | bbc
+(4 rows)
+
+SELECT a, dup(b) FROM collate_test2 ORDER BY 2;
+ a | dup 
+---+-----
+ 1 | abc
+ 4 | ABC
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+SELECT a, dup(b) FROM collate_test3 ORDER BY 2;
+ a | dup 
+---+-----
+ 4 | ABC
+ 1 | abc
+ 3 | bbc
+ 2 | äbc
+(4 rows)
+
+-- indexes
+CREATE INDEX collate_test1_idx1 ON collate_test1 (b);
+CREATE INDEX collate_test1_idx2 ON collate_test1 (b COLLATE "C");
+CREATE INDEX collate_test1_idx3 ON collate_test1 ((b COLLATE "C")); -- this is different grammatically
+CREATE INDEX collate_test1_idx4 ON collate_test1 (((b||'foo') COLLATE "POSIX"));
+CREATE INDEX collate_test1_idx5 ON collate_test1 (a COLLATE "C"); -- fail
+ERROR:  collations are not supported by type integer
+CREATE INDEX collate_test1_idx6 ON collate_test1 ((a COLLATE "C")); -- fail
+ERROR:  collations are not supported by type integer
+LINE 1: ...ATE INDEX collate_test1_idx6 ON collate_test1 ((a COLLATE "C...
+                                                             ^
+SELECT relname, pg_get_indexdef(oid) FROM pg_class WHERE relname LIKE 'collate_test%_idx%' ORDER BY 1;
+      relname       |                                           pg_get_indexdef                                           
+--------------------+-----------------------------------------------------------------------------------------------------
+ collate_test1_idx1 | CREATE INDEX collate_test1_idx1 ON collate_test1 USING btree (b)
+ collate_test1_idx2 | CREATE INDEX collate_test1_idx2 ON collate_test1 USING btree (b COLLATE "C")
+ collate_test1_idx3 | CREATE INDEX collate_test1_idx3 ON collate_test1 USING btree (b COLLATE "C")
+ collate_test1_idx4 | CREATE INDEX collate_test1_idx4 ON collate_test1 USING btree (((b || 'foo'::text)) COLLATE "POSIX")
+(4 rows)
+
+-- schema manipulation commands
+CREATE ROLE regress_test_role;
+CREATE SCHEMA test_schema;
+-- We need to do this this way to cope with varying names for encodings:
+do $$
+BEGIN
+  EXECUTE 'CREATE COLLATION test0 (provider = icu, locale = ' ||
+          quote_literal(current_setting('lc_collate')) || ');';
+END
+$$;
+CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
+ERROR:  collation "test0" already exists
+do $$
+BEGIN
+  EXECUTE 'CREATE COLLATION test1 (provider = icu, lc_collate = ' ||
+          quote_literal(current_setting('lc_collate')) ||
+          ', lc_ctype = ' ||
+          quote_literal(current_setting('lc_ctype')) || ');';
+END
+$$;
+CREATE COLLATION test3 (provider = icu, lc_collate = 'en_US.utf8'); -- fail, need lc_ctype
+ERROR:  parameter "lc_ctype" must be specified
+--CREATE COLLATION testx (provider = icu, locale = 'nonsense'); -- never fails
+CREATE COLLATION test4 FROM nonsense;
+ERROR:  collation "nonsense" for encoding "UTF8" does not exist
+CREATE COLLATION test5 FROM test0;
+SELECT collname FROM pg_collation WHERE collname LIKE 'test%' ORDER BY 1;
+ collname 
+----------
+ test0
+ test1
+ test5
+(3 rows)
+
+ALTER COLLATION test1 RENAME TO test11;
+ALTER COLLATION test0 RENAME TO test11; -- fail
+ERROR:  collation "test11" already exists in schema "public"
+ALTER COLLATION test1 RENAME TO test22; -- fail
+ERROR:  collation "test1" for encoding "UTF8" does not exist
+ALTER COLLATION test11 OWNER TO regress_test_role;
+ALTER COLLATION test11 OWNER TO nonsense;
+ERROR:  role "nonsense" does not exist
+ALTER COLLATION test11 SET SCHEMA test_schema;
+COMMENT ON COLLATION test0 IS 'US English';
+SELECT collname, nspname, obj_description(pg_collation.oid, 'pg_collation')
+    FROM pg_collation JOIN pg_namespace ON (collnamespace = pg_namespace.oid)
+    WHERE collname LIKE 'test%'
+    ORDER BY 1;
+ collname |   nspname   | obj_description 
+----------+-------------+-----------------
+ test0    | public      | US English
+ test11   | test_schema | 
+ test5    | public      | 
+(3 rows)
+
+DROP COLLATION test0, test_schema.test11, test5;
+DROP COLLATION test0; -- fail
+ERROR:  collation "test0" for encoding "UTF8" does not exist
+DROP COLLATION IF EXISTS test0;
+NOTICE:  collation "test0" does not exist, skipping
+SELECT collname FROM pg_collation WHERE collname LIKE 'test%';
+ collname 
+----------
+(0 rows)
+
+DROP SCHEMA test_schema;
+DROP ROLE regress_test_role;
+-- ALTER
+ALTER COLLATION "en_US%icu" REFRESH VERSION;
+NOTICE:  version has not changed
+-- dependencies
+CREATE COLLATION test0 FROM "C";
+CREATE TABLE collate_dep_test1 (a int, b text COLLATE test0);
+CREATE DOMAIN collate_dep_dom1 AS text COLLATE test0;
+CREATE TYPE collate_dep_test2 AS (x int, y text COLLATE test0);
+CREATE VIEW collate_dep_test3 AS SELECT text 'foo' COLLATE test0 AS foo;
+CREATE TABLE collate_dep_test4t (a int, b text);
+CREATE INDEX collate_dep_test4i ON collate_dep_test4t (b COLLATE test0);
+DROP COLLATION test0 RESTRICT; -- fail
+ERROR:  cannot drop collation test0 because other objects depend on it
+DETAIL:  table collate_dep_test1 column b depends on collation test0
+type collate_dep_dom1 depends on collation test0
+composite type collate_dep_test2 column y depends on collation test0
+view collate_dep_test3 depends on collation test0
+index collate_dep_test4i depends on collation test0
+HINT:  Use DROP ... CASCADE to drop the dependent objects too.
+DROP COLLATION test0 CASCADE;
+NOTICE:  drop cascades to 5 other objects
+DETAIL:  drop cascades to table collate_dep_test1 column b
+drop cascades to type collate_dep_dom1
+drop cascades to composite type collate_dep_test2 column y
+drop cascades to view collate_dep_test3
+drop cascades to index collate_dep_test4i
+\d collate_dep_test1
+         Table "public.collate_dep_test1"
+ Column |  Type   | Collation | Nullable | Default 
+--------+---------+-----------+----------+---------
+ a      | integer |           |          | 
+
+\d collate_dep_test2
+     Composite type "public.collate_dep_test2"
+ Column |  Type   | Collation | Nullable | Default 
+--------+---------+-----------+----------+---------
+ x      | integer |           |          | 
+
+DROP TABLE collate_dep_test1, collate_dep_test4t;
+DROP TYPE collate_dep_test2;
+-- test range types and collations
+create type textrange_c as range(subtype=text, collation="C");
+create type textrange_en_us as range(subtype=text, collation="en_US%icu");
+select textrange_c('A','Z') @> 'b'::text;
+ ?column? 
+----------
+ f
+(1 row)
+
+select textrange_en_us('A','Z') @> 'b'::text;
+ ?column? 
+----------
+ t
+(1 row)
+
+drop type textrange_c;
+drop type textrange_en_us;
diff --git a/src/test/regress/sql/collate.icu.sql b/src/test/regress/sql/collate.icu.sql
new file mode 100644
index 0000000000..3dcabbb3fd
--- /dev/null
+++ b/src/test/regress/sql/collate.icu.sql
@@ -0,0 +1,420 @@
+/*
+ * This test is for ICU collations.
+ */
+
+SET client_encoding TO UTF8;
+
+
+CREATE TABLE collate_test1 (
+    a int,
+    b text COLLATE "en_US%icu" NOT NULL
+);
+
+\d collate_test1
+
+CREATE TABLE collate_test_fail (
+    a int,
+    b text COLLATE "ja_JP.eucjp%icu"
+);
+
+CREATE TABLE collate_test_fail (
+    a int,
+    b text COLLATE "foo%icu"
+);
+
+CREATE TABLE collate_test_fail (
+    a int COLLATE "en_US%icu",
+    b text
+);
+
+CREATE TABLE collate_test_like (
+    LIKE collate_test1
+);
+
+\d collate_test_like
+
+CREATE TABLE collate_test2 (
+    a int,
+    b text COLLATE "sv_SE%icu"
+);
+
+CREATE TABLE collate_test3 (
+    a int,
+    b text COLLATE "C"
+);
+
+INSERT INTO collate_test1 VALUES (1, 'abc'), (2, 'äbc'), (3, 'bbc'), (4, 'ABC');
+INSERT INTO collate_test2 SELECT * FROM collate_test1;
+INSERT INTO collate_test3 SELECT * FROM collate_test1;
+
+SELECT * FROM collate_test1 WHERE b >= 'bbc';
+SELECT * FROM collate_test2 WHERE b >= 'bbc';
+SELECT * FROM collate_test3 WHERE b >= 'bbc';
+SELECT * FROM collate_test3 WHERE b >= 'BBC';
+
+SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc';
+SELECT * FROM collate_test1 WHERE b >= 'bbc' COLLATE "C";
+SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "C";
+SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "en_US%icu";
+
+
+CREATE DOMAIN testdomain_sv AS text COLLATE "sv_SE%icu";
+CREATE DOMAIN testdomain_i AS int COLLATE "sv_SE%icu"; -- fails
+CREATE TABLE collate_test4 (
+    a int,
+    b testdomain_sv
+);
+INSERT INTO collate_test4 SELECT * FROM collate_test1;
+SELECT a, b FROM collate_test4 ORDER BY b;
+
+CREATE TABLE collate_test5 (
+    a int,
+    b testdomain_sv COLLATE "en_US%icu"
+);
+INSERT INTO collate_test5 SELECT * FROM collate_test1;
+SELECT a, b FROM collate_test5 ORDER BY b;
+
+
+SELECT a, b FROM collate_test1 ORDER BY b;
+SELECT a, b FROM collate_test2 ORDER BY b;
+SELECT a, b FROM collate_test3 ORDER BY b;
+
+SELECT a, b FROM collate_test1 ORDER BY b COLLATE "C";
+
+-- star expansion
+SELECT * FROM collate_test1 ORDER BY b;
+SELECT * FROM collate_test2 ORDER BY b;
+SELECT * FROM collate_test3 ORDER BY b;
+
+-- constant expression folding
+SELECT 'bbc' COLLATE "en_US%icu" > 'äbc' COLLATE "en_US%icu" AS "true";
+SELECT 'bbc' COLLATE "sv_SE%icu" > 'äbc' COLLATE "sv_SE%icu" AS "false";
+
+-- upper/lower
+
+CREATE TABLE collate_test10 (
+    a int,
+    x text COLLATE "en_US%icu",
+    y text COLLATE "tr_TR%icu"
+);
+
+INSERT INTO collate_test10 VALUES (1, 'hij', 'hij'), (2, 'HIJ', 'HIJ');
+
+SELECT a, lower(x), lower(y), upper(x), upper(y), initcap(x), initcap(y) FROM collate_test10;
+SELECT a, lower(x COLLATE "C"), lower(y COLLATE "C") FROM collate_test10;
+
+SELECT a, x, y FROM collate_test10 ORDER BY lower(y), a;
+
+-- LIKE/ILIKE
+
+SELECT * FROM collate_test1 WHERE b LIKE 'abc';
+SELECT * FROM collate_test1 WHERE b LIKE 'abc%';
+SELECT * FROM collate_test1 WHERE b LIKE '%bc%';
+SELECT * FROM collate_test1 WHERE b ILIKE 'abc';
+SELECT * FROM collate_test1 WHERE b ILIKE 'abc%';
+SELECT * FROM collate_test1 WHERE b ILIKE '%bc%';
+
+SELECT 'Türkiye' COLLATE "en_US%icu" ILIKE '%KI%' AS "true";
+SELECT 'Türkiye' COLLATE "tr_TR%icu" ILIKE '%KI%' AS "false";
+
+SELECT 'bıt' ILIKE 'BIT' COLLATE "en_US%icu" AS "false";
+SELECT 'bıt' ILIKE 'BIT' COLLATE "tr_TR%icu" AS "true";
+
+-- The following actually exercises the selectivity estimation for ILIKE.
+SELECT relname FROM pg_class WHERE relname ILIKE 'abc%';
+
+-- regular expressions
+
+SELECT * FROM collate_test1 WHERE b ~ '^abc$';
+SELECT * FROM collate_test1 WHERE b ~ '^abc';
+SELECT * FROM collate_test1 WHERE b ~ 'bc';
+SELECT * FROM collate_test1 WHERE b ~* '^abc$';
+SELECT * FROM collate_test1 WHERE b ~* '^abc';
+SELECT * FROM collate_test1 WHERE b ~* 'bc';
+
+CREATE TABLE collate_test6 (
+    a int,
+    b text COLLATE "en_US%icu"
+);
+INSERT INTO collate_test6 VALUES (1, 'abc'), (2, 'ABC'), (3, '123'), (4, 'ab1'),
+                                 (5, 'a1!'), (6, 'a c'), (7, '!.;'), (8, '   '),
+                                 (9, 'äbç'), (10, 'ÄBÇ');
+SELECT b,
+       b ~ '^[[:alpha:]]+$' AS is_alpha,
+       b ~ '^[[:upper:]]+$' AS is_upper,
+       b ~ '^[[:lower:]]+$' AS is_lower,
+       b ~ '^[[:digit:]]+$' AS is_digit,
+       b ~ '^[[:alnum:]]+$' AS is_alnum,
+       b ~ '^[[:graph:]]+$' AS is_graph,
+       b ~ '^[[:print:]]+$' AS is_print,
+       b ~ '^[[:punct:]]+$' AS is_punct,
+       b ~ '^[[:space:]]+$' AS is_space
+FROM collate_test6;
+
+SELECT 'Türkiye' COLLATE "en_US%icu" ~* 'KI' AS "true";
+SELECT 'Türkiye' COLLATE "tr_TR%icu" ~* 'KI' AS "true";  -- true with ICU
+
+SELECT 'bıt' ~* 'BIT' COLLATE "en_US%icu" AS "false";
+SELECT 'bıt' ~* 'BIT' COLLATE "tr_TR%icu" AS "false";  -- false with ICU
+
+-- The following actually exercises the selectivity estimation for ~*.
+SELECT relname FROM pg_class WHERE relname ~* '^abc';
+
+
+-- to_char
+
+SET lc_time TO 'tr_TR';
+SELECT to_char(date '2010-04-01', 'DD TMMON YYYY');
+SELECT to_char(date '2010-04-01', 'DD TMMON YYYY' COLLATE "tr_TR%icu");
+
+
+-- backwards parsing
+
+CREATE VIEW collview1 AS SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc';
+CREATE VIEW collview2 AS SELECT a, b FROM collate_test1 ORDER BY b COLLATE "C";
+CREATE VIEW collview3 AS SELECT a, lower((x || x) COLLATE "C") FROM collate_test10;
+
+SELECT table_name, view_definition FROM information_schema.views
+  WHERE table_name LIKE 'collview%' ORDER BY 1;
+
+
+-- collation propagation in various expression types
+
+SELECT a, coalesce(b, 'foo') FROM collate_test1 ORDER BY 2;
+SELECT a, coalesce(b, 'foo') FROM collate_test2 ORDER BY 2;
+SELECT a, coalesce(b, 'foo') FROM collate_test3 ORDER BY 2;
+SELECT a, lower(coalesce(x, 'foo')), lower(coalesce(y, 'foo')) FROM collate_test10;
+
+SELECT a, b, greatest(b, 'CCC') FROM collate_test1 ORDER BY 3;
+SELECT a, b, greatest(b, 'CCC') FROM collate_test2 ORDER BY 3;
+SELECT a, b, greatest(b, 'CCC') FROM collate_test3 ORDER BY 3;
+SELECT a, x, y, lower(greatest(x, 'foo')), lower(greatest(y, 'foo')) FROM collate_test10;
+
+SELECT a, nullif(b, 'abc') FROM collate_test1 ORDER BY 2;
+SELECT a, nullif(b, 'abc') FROM collate_test2 ORDER BY 2;
+SELECT a, nullif(b, 'abc') FROM collate_test3 ORDER BY 2;
+SELECT a, lower(nullif(x, 'foo')), lower(nullif(y, 'foo')) FROM collate_test10;
+
+SELECT a, CASE b WHEN 'abc' THEN 'abcd' ELSE b END FROM collate_test1 ORDER BY 2;
+SELECT a, CASE b WHEN 'abc' THEN 'abcd' ELSE b END FROM collate_test2 ORDER BY 2;
+SELECT a, CASE b WHEN 'abc' THEN 'abcd' ELSE b END FROM collate_test3 ORDER BY 2;
+
+CREATE DOMAIN testdomain AS text;
+SELECT a, b::testdomain FROM collate_test1 ORDER BY 2;
+SELECT a, b::testdomain FROM collate_test2 ORDER BY 2;
+SELECT a, b::testdomain FROM collate_test3 ORDER BY 2;
+SELECT a, b::testdomain_sv FROM collate_test3 ORDER BY 2;
+SELECT a, lower(x::testdomain), lower(y::testdomain) FROM collate_test10;
+
+SELECT min(b), max(b) FROM collate_test1;
+SELECT min(b), max(b) FROM collate_test2;
+SELECT min(b), max(b) FROM collate_test3;
+
+SELECT array_agg(b ORDER BY b) FROM collate_test1;
+SELECT array_agg(b ORDER BY b) FROM collate_test2;
+SELECT array_agg(b ORDER BY b) FROM collate_test3;
+
+SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM collate_test1 ORDER BY 2;
+SELECT a, b FROM collate_test2 UNION SELECT a, b FROM collate_test2 ORDER BY 2;
+SELECT a, b FROM collate_test3 WHERE a < 4 INTERSECT SELECT a, b FROM collate_test3 WHERE a > 1 ORDER BY 2;
+SELECT a, b FROM collate_test3 EXCEPT SELECT a, b FROM collate_test3 WHERE a < 2 ORDER BY 2;
+
+SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
+SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM collate_test3; -- ok
+SELECT a, b FROM collate_test1 UNION SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
+SELECT a, b COLLATE "C" FROM collate_test1 UNION SELECT a, b FROM collate_test3 ORDER BY 2; -- ok
+SELECT a, b FROM collate_test1 INTERSECT SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
+SELECT a, b FROM collate_test1 EXCEPT SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
+
+CREATE TABLE test_u AS SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM collate_test3; -- fail
+
+-- ideally this would be a parse-time error, but for now it must be run-time:
+select x < y from collate_test10; -- fail
+select x || y from collate_test10; -- ok, because || is not collation aware
+select x, y from collate_test10 order by x || y; -- not so ok
+
+-- collation mismatch between recursive and non-recursive term
+WITH RECURSIVE foo(x) AS
+   (SELECT x FROM (VALUES('a' COLLATE "en_US%icu"),('b')) t(x)
+   UNION ALL
+   SELECT (x || 'c') COLLATE "de_DE%icu" FROM foo WHERE length(x) < 10)
+SELECT * FROM foo;
+
+
+-- casting
+
+SELECT CAST('42' AS text COLLATE "C");
+
+SELECT a, CAST(b AS varchar) FROM collate_test1 ORDER BY 2;
+SELECT a, CAST(b AS varchar) FROM collate_test2 ORDER BY 2;
+SELECT a, CAST(b AS varchar) FROM collate_test3 ORDER BY 2;
+
+
+-- propagation of collation in SQL functions (inlined and non-inlined cases)
+-- and plpgsql functions too
+
+CREATE FUNCTION mylt (text, text) RETURNS boolean LANGUAGE sql
+    AS $$ select $1 < $2 $$;
+
+CREATE FUNCTION mylt_noninline (text, text) RETURNS boolean LANGUAGE sql
+    AS $$ select $1 < $2 limit 1 $$;
+
+CREATE FUNCTION mylt_plpgsql (text, text) RETURNS boolean LANGUAGE plpgsql
+    AS $$ begin return $1 < $2; end $$;
+
+SELECT a.b AS a, b.b AS b, a.b < b.b AS lt,
+       mylt(a.b, b.b), mylt_noninline(a.b, b.b), mylt_plpgsql(a.b, b.b)
+FROM collate_test1 a, collate_test1 b
+ORDER BY a.b, b.b;
+
+SELECT a.b AS a, b.b AS b, a.b < b.b COLLATE "C" AS lt,
+       mylt(a.b, b.b COLLATE "C"), mylt_noninline(a.b, b.b COLLATE "C"),
+       mylt_plpgsql(a.b, b.b COLLATE "C")
+FROM collate_test1 a, collate_test1 b
+ORDER BY a.b, b.b;
+
+
+-- collation override in plpgsql
+
+CREATE FUNCTION mylt2 (x text, y text) RETURNS boolean LANGUAGE plpgsql AS $$
+declare
+  xx text := x;
+  yy text := y;
+begin
+  return xx < yy;
+end
+$$;
+
+SELECT mylt2('a', 'B' collate "en_US%icu") as t, mylt2('a', 'B' collate "C") as f;
+
+CREATE OR REPLACE FUNCTION
+  mylt2 (x text, y text) RETURNS boolean LANGUAGE plpgsql AS $$
+declare
+  xx text COLLATE "POSIX" := x;
+  yy text := y;
+begin
+  return xx < yy;
+end
+$$;
+
+SELECT mylt2('a', 'B') as f;
+SELECT mylt2('a', 'B' collate "C") as fail; -- conflicting collations
+SELECT mylt2('a', 'B' collate "POSIX") as f;
+
+
+-- polymorphism
+
+SELECT * FROM unnest((SELECT array_agg(b ORDER BY b) FROM collate_test1)) ORDER BY 1;
+SELECT * FROM unnest((SELECT array_agg(b ORDER BY b) FROM collate_test2)) ORDER BY 1;
+SELECT * FROM unnest((SELECT array_agg(b ORDER BY b) FROM collate_test3)) ORDER BY 1;
+
+CREATE FUNCTION dup (anyelement) RETURNS anyelement
+    AS 'select $1' LANGUAGE sql;
+
+SELECT a, dup(b) FROM collate_test1 ORDER BY 2;
+SELECT a, dup(b) FROM collate_test2 ORDER BY 2;
+SELECT a, dup(b) FROM collate_test3 ORDER BY 2;
+
+
+-- indexes
+
+CREATE INDEX collate_test1_idx1 ON collate_test1 (b);
+CREATE INDEX collate_test1_idx2 ON collate_test1 (b COLLATE "C");
+CREATE INDEX collate_test1_idx3 ON collate_test1 ((b COLLATE "C")); -- this is different grammatically
+CREATE INDEX collate_test1_idx4 ON collate_test1 (((b||'foo') COLLATE "POSIX"));
+
+CREATE INDEX collate_test1_idx5 ON collate_test1 (a COLLATE "C"); -- fail
+CREATE INDEX collate_test1_idx6 ON collate_test1 ((a COLLATE "C")); -- fail
+
+SELECT relname, pg_get_indexdef(oid) FROM pg_class WHERE relname LIKE 'collate_test%_idx%' ORDER BY 1;
+
+
+-- schema manipulation commands
+
+CREATE ROLE regress_test_role;
+CREATE SCHEMA test_schema;
+
+-- We need to do this this way to cope with varying names for encodings:
+do $$
+BEGIN
+  EXECUTE 'CREATE COLLATION test0 (provider = icu, locale = ' ||
+          quote_literal(current_setting('lc_collate')) || ');';
+END
+$$;
+CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
+do $$
+BEGIN
+  EXECUTE 'CREATE COLLATION test1 (provider = icu, lc_collate = ' ||
+          quote_literal(current_setting('lc_collate')) ||
+          ', lc_ctype = ' ||
+          quote_literal(current_setting('lc_ctype')) || ');';
+END
+$$;
+CREATE COLLATION test3 (provider = icu, lc_collate = 'en_US.utf8'); -- fail, need lc_ctype
+--CREATE COLLATION testx (provider = icu, locale = 'nonsense'); -- never fails
+
+CREATE COLLATION test4 FROM nonsense;
+CREATE COLLATION test5 FROM test0;
+
+SELECT collname FROM pg_collation WHERE collname LIKE 'test%' ORDER BY 1;
+
+ALTER COLLATION test1 RENAME TO test11;
+ALTER COLLATION test0 RENAME TO test11; -- fail
+ALTER COLLATION test1 RENAME TO test22; -- fail
+
+ALTER COLLATION test11 OWNER TO regress_test_role;
+ALTER COLLATION test11 OWNER TO nonsense;
+ALTER COLLATION test11 SET SCHEMA test_schema;
+
+COMMENT ON COLLATION test0 IS 'US English';
+
+SELECT collname, nspname, obj_description(pg_collation.oid, 'pg_collation')
+    FROM pg_collation JOIN pg_namespace ON (collnamespace = pg_namespace.oid)
+    WHERE collname LIKE 'test%'
+    ORDER BY 1;
+
+DROP COLLATION test0, test_schema.test11, test5;
+DROP COLLATION test0; -- fail
+DROP COLLATION IF EXISTS test0;
+
+SELECT collname FROM pg_collation WHERE collname LIKE 'test%';
+
+DROP SCHEMA test_schema;
+DROP ROLE regress_test_role;
+
+
+-- ALTER
+
+ALTER COLLATION "en_US%icu" REFRESH VERSION;
+
+
+-- dependencies
+
+CREATE COLLATION test0 FROM "C";
+
+CREATE TABLE collate_dep_test1 (a int, b text COLLATE test0);
+CREATE DOMAIN collate_dep_dom1 AS text COLLATE test0;
+CREATE TYPE collate_dep_test2 AS (x int, y text COLLATE test0);
+CREATE VIEW collate_dep_test3 AS SELECT text 'foo' COLLATE test0 AS foo;
+CREATE TABLE collate_dep_test4t (a int, b text);
+CREATE INDEX collate_dep_test4i ON collate_dep_test4t (b COLLATE test0);
+
+DROP COLLATION test0 RESTRICT; -- fail
+DROP COLLATION test0 CASCADE;
+
+\d collate_dep_test1
+\d collate_dep_test2
+
+DROP TABLE collate_dep_test1, collate_dep_test4t;
+DROP TYPE collate_dep_test2;
+
+-- test range types and collations
+
+create type textrange_c as range(subtype=text, collation="C");
+create type textrange_en_us as range(subtype=text, collation="en_US%icu");
+
+select textrange_c('A','Z') @> 'b'::text;
+select textrange_en_us('A','Z') @> 'b'::text;
+
+drop type textrange_c;
+drop type textrange_en_us;
-- 
2.11.0

#52Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Peter Geoghegan (#49)
Re: ICU integration

On 1/9/17 3:45 PM, Peter Geoghegan wrote:

* I think it's worth looking into ucol_nextSortKeyPart(), and using
that as an alternative to ucol_getSortKey(). It doesn't seem any
harder, and when I tested it it was clearly faster. (I think that
ucol_nextSortKeyPart() is more or less intended to be used for
abbreviated keys.)

I will try to look into that.

* I think that it's not okay that convert_string_datum() still uses
strxfrm() without considering if it's an ICU build. That's why I
raised the idea of a pg_strxfrm() wrapper at one point.

That code works in a locale-agnostic way now, which might be
questionable, but it's not the fault of this patch, I think.

* Similarly, I think that check_strxfrm_bug() should have something
about ICU. It's looking for a particular bug in some very old version
of Solaris 8. At a minimum, check_strxfrm_bug() should now not run at
all (a broken OS strxfrm() shouldn't be a problem with ICU).

But we'll still be using strxfrm for the database default locale, so we
need to keep that check.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#53Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Peter Geoghegan (#49)
Re: ICU integration

On Tue, Jan 10, 2017 at 9:45 AM, Peter Geoghegan <pg@heroku.com> wrote:

* I think it's worth looking into ucol_nextSortKeyPart(), and using
that as an alternative to ucol_getSortKey(). It doesn't seem any
harder, and when I tested it it was clearly faster. (I think that
ucol_nextSortKeyPart() is more or less intended to be used for
abbreviated keys.)

+1

I assume (but haven't checked) that ucol_nextSortKeyPart accesses only
the start of the string via the UCharIterator passed in, unless you
have the rare reverse-accent-sort feature enabled (maybe used only in
fr_CA, it looks like it is required to scan the whole string looking
for the last accent). So I assume that uiter_setUTF8 and
ucol_nextSortKeyPart would usually do a small fixed amount of work,
whereas this patch's icu_to_uchar allocates space and converts the
whole variable length string every time.

On the other hand, I see that this patch's icu_to_uchar can deal with
encodings other than UTF8. At first glance, the UCharIterator
interface provides a way to make a UCharIterator that iterates over a
string of UChar or a string of UTF8, but I'm not sure how to make it
do character-by-character transcoding from arbitrary encodings via the
C API. It seems like that should be possible using ucnv_getNextUChar
as the source of transcoded characters, but I'm not sure how to wire
those two things together (in C++ I think you'd subclass
icu::CharacterIterator).

That's about abbreviation, but I note that you can also compare
strings using iterators with ucol_strcollIter, avoiding the need to
allocate and transcode up front. I have no idea whether that'd pay
off.
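
To make that concrete, here is a minimal sketch of what the iterator-based
approach could look like, both for pulling the first few sort key bytes and
for a full comparison without up-front transcoding. This is not taken from
the patch; it assumes the input is already UTF-8 so that uiter_setUTF8()
applies, and error handling is reduced to a single status check.

#include <unicode/ucol.h>
#include <unicode/uiter.h>

/* First bytes of the sort key, e.g. for an abbreviated key. */
static int32_t
sortkey_prefix_utf8(UCollator *collator, const char *s, int32_t len,
                    uint8_t *dest, int32_t destlen)
{
    UCharIterator iter;
    uint32_t      state[2] = {0, 0};
    UErrorCode    status = U_ZERO_ERROR;
    int32_t       n;

    uiter_setUTF8(&iter, s, len);
    n = ucol_nextSortKeyPart(collator, &iter, state, dest, destlen, &status);
    if (U_FAILURE(status))
        return -1;              /* real code would report the error */
    return n;                   /* number of bytes written, <= destlen */
}

/* Full comparison driven by iterators, no up-front conversion. */
static int
strcoll_utf8(UCollator *collator,
             const char *a, int32_t alen,
             const char *b, int32_t blen)
{
    UCharIterator    ia,
                     ib;
    UErrorCode       status = U_ZERO_ERROR;
    UCollationResult r;

    uiter_setUTF8(&ia, a, alen);
    uiter_setUTF8(&ib, b, blen);
    r = ucol_strcollIter(collator, &ia, &ib, &status);
    return (r == UCOL_GREATER) ? 1 : (r == UCOL_LESS) ? -1 : 0;
}

Whether the iterator variants actually pay off would of course still need
to be measured against the patch's convert-then-compare path.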

+   A change in collation definitions can lead to corrupt indexes and other
+   problems where the database system relies on stored objects having a
+   certain sort order.  Generally, this should be avoided, but it can happen
+   in legitimate circumstances, such as when
+   using <command>pg_upgrade</command> to upgrade to server binaries linked
+   with a newer version of ICU.  When this happens, all objects depending on
+   the collation should be rebuilt, for example,
+   using <command>REINDEX</command>.  When that is done, the collation version
+   can be refreshed using the command <literal>ALTER COLLATION ... REFRESH
+   VERSION</literal>.  This will update the system catalog to record the
+   current collator version and will make the warning go away.  Note that this
+   does not actually check whether all affected objects have been rebuilt

I think this is a pretty reasonable first approach to this problem.
It's a simple way to flag up a problem to the DBA, but leaves all the
responsibility for figuring out how to fix it to the DBA. I think we
should consider going further in later patches (tracking the
version used at last rebuild per index, etc., as discussed, so that the
condition is cleared only by rebuilding the affected things).

(REPARTITION anyone?)

As far as I know there are two things moving: ICU code and CLDR data.
Here we can see the CLDR versions being pulled into ICU:

http://bugs.icu-project.org/trac/log/icu/trunk/source/data/locales?rev=39273

Clearly when you upgrade your system from (say) Debian 8 to Debian 9
and the ICU major version changes we expect to have to REINDEX, but
does anyone know if such data changes are ever pulled into the minor
version package upgrades you get from regular apt-get update of (say)
a Debian 8 or CentOS 7 or FreeBSD 11 system? In other words, do we
expect collversion changes to happen basically any time in response to
regular system updates, or only when you're doing a major upgrade of
some kind, as the above-quoted documentation patch implies?

+ collversion = ntohl(*((uint32 *) versioninfo));

UVersionInfo is an array of four uint8_t. That doesn't sound like
something that needs to be bit-swizzled... is it really? Even if it
is arranged differently on different architectures, I'm not sure why
you care since we only ever use it to compare for equality on the same
system. But aside from that, I don't love this cast to uint32. I
wonder if we should use u_versionToString to respect boundaries a bit
better?
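
For what it's worth, reading the version as a string could be as simple as
this sketch (not from the patch; error reporting is reduced to returning an
empty string):

#include <unicode/ucol.h>
#include <unicode/uversion.h>

static void
collator_version_string(const char *locale,
                        char out[U_MAX_VERSION_STRING_LENGTH])
{
    UErrorCode   status = U_ZERO_ERROR;
    UCollator   *collator = ucol_open(locale, &status);
    UVersionInfo info;                  /* four uint8_t components */

    out[0] = '\0';
    if (U_FAILURE(status))
        return;                         /* real code would report the error */
    ucol_getVersion(collator, info);    /* version of this collator's data */
    u_versionToString(info, out);       /* dotted string representation */
    ucol_close(collator);
}

That would sidestep both the endianness question and the cast to uint32.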

I have another motivation for wanting to model collation versions as
strings: I have been contemplating a version check for system-provided
collations too, piggy-backing on your work here. Obviously PostgreSQL
can't directly know anything about system collation versions, but I'm
thinking of a GUC system_collation_version_command which you could
optionally set to a script that knows how to inspect the local
operating system. For example, a package maintainer might be
interested in writing such a script that knows how to ask the package
system for a locale data version. A brute force approach that works
on many systems could be 'tar c /usr/share/locale/*/LC_COLLATE | md5'.
A string would provide more leeway than a measly int32. That's
independent of this patch and you might hate the whole idea, but seems
to be the kind of thing you anticipated when you described collversion
as "[p]rovider-specific version of the collation".

--
Thomas Munro
http://www.enterprisedb.com


#54Michael Paquier
michael.paquier@gmail.com
In reply to: Peter Eisentraut (#51)
Re: ICU integration

On Wed, Jan 25, 2017 at 2:44 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 1/15/17 5:53 AM, Pavel Stehule wrote:

the regress test fails

Program received signal SIGSEGV, Segmentation fault.
0x00000000007bbc2b in pattern_char_isalpha (locale_is_c=0 '\000',
locale=0x1a73220, is_multibyte=1 '\001', c=97 'a') at selfuncs.c:5291
5291    return isalpha_l((unsigned char) c, locale->lt);

Here is an updated patch that fixes this crash and is rebased on top of
recent changes.

Patch that still applies + no reviews = moved to CF 2017-03.
--
Michael


#55Pavel Stehule
pavel.stehule@gmail.com
In reply to: Peter Eisentraut (#51)
Re: ICU integration

Hi

2017-01-24 18:44 GMT+01:00 Peter Eisentraut <
peter.eisentraut@2ndquadrant.com>:

On 1/15/17 5:53 AM, Pavel Stehule wrote:

the regress test fails

Program received signal SIGSEGV, Segmentation fault.
0x00000000007bbc2b in pattern_char_isalpha (locale_is_c=0 '\000',
locale=0x1a73220, is_multibyte=1 '\001', c=97 'a') at selfuncs.c:5291
5291    return isalpha_l((unsigned char) c, locale->lt);

Here is an updated patch that fixes this crash and is rebased on top of
recent changes.

This patch does not compile on today's master

commands/collationcmds.o: In function `AlterCollation':
/home/pavel/src/postgresql/src/backend/commands/collationcmds.c:297:
undefined reference to `CatalogUpdateIndexes'
collect2: error: ld returned 1 exit status

Regards

Pavel


--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#56Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Stehule (#55)
Re: ICU integration

Pavel Stehule <pavel.stehule@gmail.com> writes:

This patch is not possible to compile on today master
commands/collationcmds.o: In function `AlterCollation':
/home/pavel/src/postgresql/src/backend/commands/collationcmds.c:297:
undefined reference to `CatalogUpdateIndexes'

Evidently collateral damage from 2f5c9d9c9. But I'd suggest waiting
to fix it until you can also do s/simple_heap_delete/CatalogTupleDelete/
as I proposed in
/messages/by-id/462.1485902736@sss.pgh.pa.us

I'll go make that happen right now, now that I realize there are patch(es)
waiting on it.

regards, tom lane


#57Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#56)
Re: ICU integration

I wrote:

Evidently collateral damage from 2f5c9d9c9. But I'd suggest waiting
to fix it until you can also do s/simple_heap_delete/CatalogTupleDelete/
as I proposed in
/messages/by-id/462.1485902736@sss.pgh.pa.us
I'll go make that happen right now, now that I realize there are patch(es)
waiting on it.

Done. There are some more loose ends, but they won't affect very many
call sites, so you should be able to rebase now.

regards, tom lane


#58Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Peter Eisentraut (#52)
Re: ICU integration

On 1/24/17 12:45 PM, Peter Eisentraut wrote:

On 1/9/17 3:45 PM, Peter Geoghegan wrote:

* I think it's worth looking into ucol_nextSortKeyPart(), and using
that as an alternative to ucol_getSortKey(). It doesn't seem any
harder, and when I tested it it was clearly faster. (I think that
ucol_nextSortKeyPart() is more or less intended to be used for
abbreviated keys.)

I will try to look into that.

I think I have this sorted out. What I don't understand, however, is
why varstr_abbrev_convert() makes a point of looping until the result of
strxfrm() fits into the output buffer (buf2), when we only need 8 bytes,
and we throw away the rest later. Wouldn't it work to just request 8 bytes?

If there is a problem with just requesting 8 bytes, then I'm wondering
how this would affect the ICU code branch.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#59Peter Geoghegan
In reply to: Peter Eisentraut (#58)
Re: ICU integration

On Thu, Feb 9, 2017 at 7:58 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

I think I have this sorted out. What I don't understand, however, is
why varstr_abbrev_convert() makes a point of looping until the result of
strxfrm() fits into the output buffer (buf2), when we only need 8 bytes,
and we throw away the rest later. Wouldn't it work to just request 8 bytes?

Maybe. We do that because strxfrm() is not required by the standard to
produce well-defined contents for the buffer when the return value
indicates that it didn't fit entirely. This is a standard idiom, I
think.
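
For anyone following along, the idiom looks roughly like this (a sketch,
not the actual varstr_abbrev_convert() code; allocation failure handling is
omitted):

#include <stdlib.h>
#include <string.h>

static char *
xfrm_whole(const char *src)
{
    size_t      bufsize = 64;
    char       *buf = malloc(bufsize);
    size_t      needed;

    for (;;)
    {
        needed = strxfrm(buf, src, bufsize);
        if (needed < bufsize)
            break;                  /* buffer contents are now well defined */
        bufsize = needed + 1;       /* truncated: contents unspecified, retry */
        buf = realloc(buf, bufsize);
    }
    return buf;                     /* caller frees */
}

Only once the whole transformation has succeeded is it safe to take the
first 8 bytes for the abbreviated key.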

If there is a problem with just requesting 8 bytes, then I'm wondering
how this would affect the ICU code branch.

This must be fine with ICU's ucol_nextSortKeyPart(), because it is
designed for the express purpose of producing only a few bytes of the
final blob at a time.

--
Peter Geoghegan


#60Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Peter Eisentraut (#51)
1 attachment(s)
Re: ICU integration

Updated and rebased patch.

Significant changes:

- Changed collversion to type text

- Changed pg_locale_t to a union (a rough sketch follows this list)

- Use ucol_getAvailable() instead of uloc_getAvailable(), so the set of
initial collations is smaller now, because redundancies are eliminated.

- Added keyword variants to the predefined ICU collations (so you get
"de_phonebook%icu", for example), so the initial set of collations is
bigger now. :)

- Predefined ICU collations have a comment now, so \dOS+ is useful.

- Use ucol_nextSortKeyPart() for abbreviated keys

- Enhanced tests and documentation
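
As a rough sketch of the pg_locale_t change mentioned in the list above
(the exact field names here are assumptions for illustration, not
necessarily what the patch uses):

#include <locale.h>
#ifdef USE_ICU
#include <unicode/ucol.h>
#endif

struct pg_locale_struct
{
    char        provider;           /* which library provides the collation */
    union
    {
        locale_t    lt;             /* libc locale handle */
#ifdef USE_ICU
        struct
        {
            const char *locale;     /* ICU locale string */
            UCollator  *ucol;       /* opened ICU collator */
        }           icu;
#endif
    }           info;
};

typedef struct pg_locale_struct *pg_locale_t;

Callers then pick the member that matches the provider instead of assuming
a libc locale_t.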

I believe all issues raised in reviews have been addressed.

Discussion points:

- Naming of collations: Are we happy with the "de%icu" naming? I might
have come up with that while reviewing the IPv6 zone index patch. ;-)
An alternative might be "de$icu" for more Oracle vibe and avoiding the
need for double quotes in some cases. (But we have mixed-case names
like "de_AT%icu", so further changes would be necessary to fully get rid
of the need for quoting.) A more radical alternative would be to
install ICU locales in a different schema and use the search_path, or
even have a separate search path setting for collations only. Which
leads to ...

- Selecting default collation provider: Maybe we want a setting, say in
initdb, to determine which provider's collations get the "good" names?
Maybe not necessary for this release, but something to think about.

- Currently (in this patch), we check a collation's version when it is
first used. But, say, after pg_upgrade, you might want to check all of
them right away. What might be a good interface for that? (Possibly,
we only have to check the ones actually in use, and we have dependency
information for that.)
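
For the "only the ones actually in use" part, the dependency entries that
objects record against explicitly used collations can already be queried
along these lines (a sketch; "en_US%icu" is a placeholder name, and this
only lists indexes with a direct dependency):

SELECT d.objid::regclass AS index_name
FROM pg_depend d
JOIN pg_collation c ON d.refclassid = 'pg_collation'::regclass
                   AND d.refobjid = c.oid
WHERE d.classid = 'pg_class'::regclass
  AND c.collname = 'en_US%icu';

Columns, domains, and composite types that use the collation have entries
under other classids, so a complete check would have to look at those as
well.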

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v4-0001-ICU-support.patchtext/x-patch; name=v4-0001-ICU-support.patchDownload
From adac0fadbcc302845be627b9f101ca8083482e60 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter_e@gmx.net>
Date: Thu, 16 Feb 2017 00:06:16 -0500
Subject: [PATCH v4] ICU support

Add a column collprovider to pg_collation that determines which library
provides the collation data.  The existing choices are default and libc,
and this adds an icu choice, which uses the ICU4C library.

The pg_locale_t type is changed to a union that contains the
provider-specific locale handles.  Users of locale information are
changed to look into that struct for the appropriate handle to use.

Also add a collversion column that records the version of the collation
when it is created, and check at run time whether it is still the same.
This detects potentially incompatible library upgrades that can corrupt
indexes and other structures.  This is currently only supported by
ICU-provided collations.

initdb initializes the default collation set as before from the `locale
-a` output but also adds all available ICU locales with a "%icu"
appended.

Currently, ICU-provided collations can only be explicitly named
collations.  The global database locales are still always libc-provided.
---
 aclocal.m4                                         |   1 +
 config/pkg.m4                                      | 275 +++++++++++++
 configure                                          | 313 ++++++++++++++
 configure.in                                       |  35 ++
 doc/src/sgml/catalogs.sgml                         |  19 +
 doc/src/sgml/charset.sgml                          | 172 +++++++-
 doc/src/sgml/installation.sgml                     |  14 +
 doc/src/sgml/ref/alter_collation.sgml              |  42 ++
 doc/src/sgml/ref/create_collation.sgml             |  16 +-
 src/Makefile.global.in                             |   4 +
 src/backend/Makefile                               |   2 +-
 src/backend/catalog/pg_collation.c                 |  26 +-
 src/backend/commands/collationcmds.c               | 205 +++++++++-
 src/backend/common.mk                              |   2 +
 src/backend/nodes/copyfuncs.c                      |  13 +
 src/backend/nodes/equalfuncs.c                     |  11 +
 src/backend/parser/gram.y                          |  18 +-
 src/backend/regex/regc_pg_locale.c                 | 110 +++--
 src/backend/tcop/utility.c                         |   8 +
 src/backend/utils/adt/formatting.c                 | 453 +++++++++++----------
 src/backend/utils/adt/like.c                       |  53 ++-
 src/backend/utils/adt/pg_locale.c                  | 259 ++++++++++--
 src/backend/utils/adt/selfuncs.c                   |   8 +-
 src/backend/utils/adt/varlena.c                    | 155 ++++++-
 src/backend/utils/mb/encnames.c                    |  76 ++++
 src/bin/initdb/initdb.c                            |   3 +-
 src/bin/pg_dump/pg_dump.c                          |  55 ++-
 src/bin/psql/describe.c                            |   7 +-
 src/include/catalog/pg_collation.h                 |  25 +-
 src/include/catalog/pg_collation_fn.h              |   1 +
 src/include/commands/collationcmds.h               |   1 +
 src/include/mb/pg_wchar.h                          |   6 +
 src/include/nodes/nodes.h                          |   1 +
 src/include/nodes/parsenodes.h                     |  11 +
 src/include/pg_config.h.in                         |   6 +
 src/include/utils/pg_locale.h                      |  32 +-
 src/test/regress/GNUmakefile                       |   3 +
 .../{collate.linux.utf8.out => collate.icu.out}    | 188 +++++----
 src/test/regress/expected/collate.linux.utf8.out   |  78 +++-
 .../{collate.linux.utf8.sql => collate.icu.sql}    |  99 +++--
 src/test/regress/sql/collate.linux.utf8.sql        |  31 ++
 41 files changed, 2344 insertions(+), 493 deletions(-)
 create mode 100644 config/pkg.m4
 copy src/test/regress/expected/{collate.linux.utf8.out => collate.icu.out} (80%)
 copy src/test/regress/sql/{collate.linux.utf8.sql => collate.icu.sql} (82%)

diff --git a/aclocal.m4 b/aclocal.m4
index 6f930b6fc1..5ca902b6a2 100644
--- a/aclocal.m4
+++ b/aclocal.m4
@@ -7,6 +7,7 @@ m4_include([config/docbook.m4])
 m4_include([config/general.m4])
 m4_include([config/libtool.m4])
 m4_include([config/perl.m4])
+m4_include([config/pkg.m4])
 m4_include([config/programs.m4])
 m4_include([config/python.m4])
 m4_include([config/tcl.m4])
diff --git a/config/pkg.m4 b/config/pkg.m4
new file mode 100644
index 0000000000..82bea96ee7
--- /dev/null
+++ b/config/pkg.m4
@@ -0,0 +1,275 @@
+dnl pkg.m4 - Macros to locate and utilise pkg-config.   -*- Autoconf -*-
+dnl serial 11 (pkg-config-0.29.1)
+dnl
+dnl Copyright © 2004 Scott James Remnant <scott@netsplit.com>.
+dnl Copyright © 2012-2015 Dan Nicholson <dbn.lists@gmail.com>
+dnl
+dnl This program is free software; you can redistribute it and/or modify
+dnl it under the terms of the GNU General Public License as published by
+dnl the Free Software Foundation; either version 2 of the License, or
+dnl (at your option) any later version.
+dnl
+dnl This program is distributed in the hope that it will be useful, but
+dnl WITHOUT ANY WARRANTY; without even the implied warranty of
+dnl MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+dnl General Public License for more details.
+dnl
+dnl You should have received a copy of the GNU General Public License
+dnl along with this program; if not, write to the Free Software
+dnl Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA
+dnl 02111-1307, USA.
+dnl
+dnl As a special exception to the GNU General Public License, if you
+dnl distribute this file as part of a program that contains a
+dnl configuration script generated by Autoconf, you may include it under
+dnl the same distribution terms that you use for the rest of that
+dnl program.
+
+dnl PKG_PREREQ(MIN-VERSION)
+dnl -----------------------
+dnl Since: 0.29
+dnl
+dnl Verify that the version of the pkg-config macros are at least
+dnl MIN-VERSION. Unlike PKG_PROG_PKG_CONFIG, which checks the user's
+dnl installed version of pkg-config, this checks the developer's version
+dnl of pkg.m4 when generating configure.
+dnl
+dnl To ensure that this macro is defined, also add:
+dnl m4_ifndef([PKG_PREREQ],
+dnl     [m4_fatal([must install pkg-config 0.29 or later before running autoconf/autogen])])
+dnl
+dnl See the "Since" comment for each macro you use to see what version
+dnl of the macros you require.
+m4_defun([PKG_PREREQ],
+[m4_define([PKG_MACROS_VERSION], [0.29.1])
+m4_if(m4_version_compare(PKG_MACROS_VERSION, [$1]), -1,
+    [m4_fatal([pkg.m4 version $1 or higher is required but ]PKG_MACROS_VERSION[ found])])
+])dnl PKG_PREREQ
+
+dnl PKG_PROG_PKG_CONFIG([MIN-VERSION])
+dnl ----------------------------------
+dnl Since: 0.16
+dnl
+dnl Search for the pkg-config tool and set the PKG_CONFIG variable to
+dnl first found in the path. Checks that the version of pkg-config found
+dnl is at least MIN-VERSION. If MIN-VERSION is not specified, 0.9.0 is
+dnl used since that's the first version where most current features of
+dnl pkg-config existed.
+AC_DEFUN([PKG_PROG_PKG_CONFIG],
+[m4_pattern_forbid([^_?PKG_[A-Z_]+$])
+m4_pattern_allow([^PKG_CONFIG(_(PATH|LIBDIR|SYSROOT_DIR|ALLOW_SYSTEM_(CFLAGS|LIBS)))?$])
+m4_pattern_allow([^PKG_CONFIG_(DISABLE_UNINSTALLED|TOP_BUILD_DIR|DEBUG_SPEW)$])
+AC_ARG_VAR([PKG_CONFIG], [path to pkg-config utility])
+AC_ARG_VAR([PKG_CONFIG_PATH], [directories to add to pkg-config's search path])
+AC_ARG_VAR([PKG_CONFIG_LIBDIR], [path overriding pkg-config's built-in search path])
+
+if test "x$ac_cv_env_PKG_CONFIG_set" != "xset"; then
+	AC_PATH_TOOL([PKG_CONFIG], [pkg-config])
+fi
+if test -n "$PKG_CONFIG"; then
+	_pkg_min_version=m4_default([$1], [0.9.0])
+	AC_MSG_CHECKING([pkg-config is at least version $_pkg_min_version])
+	if $PKG_CONFIG --atleast-pkgconfig-version $_pkg_min_version; then
+		AC_MSG_RESULT([yes])
+	else
+		AC_MSG_RESULT([no])
+		PKG_CONFIG=""
+	fi
+fi[]dnl
+])dnl PKG_PROG_PKG_CONFIG
+
+dnl PKG_CHECK_EXISTS(MODULES, [ACTION-IF-FOUND], [ACTION-IF-NOT-FOUND])
+dnl -------------------------------------------------------------------
+dnl Since: 0.18
+dnl
+dnl Check to see whether a particular set of modules exists. Similar to
+dnl PKG_CHECK_MODULES(), but does not set variables or print errors.
+dnl
+dnl Please remember that m4 expands AC_REQUIRE([PKG_PROG_PKG_CONFIG])
+dnl only at the first occurence in configure.ac, so if the first place
+dnl it's called might be skipped (such as if it is within an "if", you
+dnl have to call PKG_CHECK_EXISTS manually
+AC_DEFUN([PKG_CHECK_EXISTS],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])dnl
+if test -n "$PKG_CONFIG" && \
+    AC_RUN_LOG([$PKG_CONFIG --exists --print-errors "$1"]); then
+  m4_default([$2], [:])
+m4_ifvaln([$3], [else
+  $3])dnl
+fi])
+
+dnl _PKG_CONFIG([VARIABLE], [COMMAND], [MODULES])
+dnl ---------------------------------------------
+dnl Internal wrapper calling pkg-config via PKG_CONFIG and setting
+dnl pkg_failed based on the result.
+m4_define([_PKG_CONFIG],
+[if test -n "$$1"; then
+    pkg_cv_[]$1="$$1"
+ elif test -n "$PKG_CONFIG"; then
+    PKG_CHECK_EXISTS([$3],
+                     [pkg_cv_[]$1=`$PKG_CONFIG --[]$2 "$3" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes ],
+		     [pkg_failed=yes])
+ else
+    pkg_failed=untried
+fi[]dnl
+])dnl _PKG_CONFIG
+
+dnl _PKG_SHORT_ERRORS_SUPPORTED
+dnl ---------------------------
+dnl Internal check to see if pkg-config supports short errors.
+AC_DEFUN([_PKG_SHORT_ERRORS_SUPPORTED],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])
+if $PKG_CONFIG --atleast-pkgconfig-version 0.20; then
+        _pkg_short_errors_supported=yes
+else
+        _pkg_short_errors_supported=no
+fi[]dnl
+])dnl _PKG_SHORT_ERRORS_SUPPORTED
+
+
+dnl PKG_CHECK_MODULES(VARIABLE-PREFIX, MODULES, [ACTION-IF-FOUND],
+dnl   [ACTION-IF-NOT-FOUND])
+dnl --------------------------------------------------------------
+dnl Since: 0.4.0
+dnl
+dnl Note that if there is a possibility the first call to
+dnl PKG_CHECK_MODULES might not happen, you should be sure to include an
+dnl explicit call to PKG_PROG_PKG_CONFIG in your configure.ac
+AC_DEFUN([PKG_CHECK_MODULES],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])dnl
+AC_ARG_VAR([$1][_CFLAGS], [C compiler flags for $1, overriding pkg-config])dnl
+AC_ARG_VAR([$1][_LIBS], [linker flags for $1, overriding pkg-config])dnl
+
+pkg_failed=no
+AC_MSG_CHECKING([for $1])
+
+_PKG_CONFIG([$1][_CFLAGS], [cflags], [$2])
+_PKG_CONFIG([$1][_LIBS], [libs], [$2])
+
+m4_define([_PKG_TEXT], [Alternatively, you may set the environment variables $1[]_CFLAGS
+and $1[]_LIBS to avoid the need to call pkg-config.
+See the pkg-config man page for more details.])
+
+if test $pkg_failed = yes; then
+   	AC_MSG_RESULT([no])
+        _PKG_SHORT_ERRORS_SUPPORTED
+        if test $_pkg_short_errors_supported = yes; then
+	        $1[]_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "$2" 2>&1`
+        else 
+	        $1[]_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "$2" 2>&1`
+        fi
+	# Put the nasty error message in config.log where it belongs
+	echo "$$1[]_PKG_ERRORS" >&AS_MESSAGE_LOG_FD
+
+	m4_default([$4], [AC_MSG_ERROR(
+[Package requirements ($2) were not met:
+
+$$1_PKG_ERRORS
+
+Consider adjusting the PKG_CONFIG_PATH environment variable if you
+installed software in a non-standard prefix.
+
+_PKG_TEXT])[]dnl
+        ])
+elif test $pkg_failed = untried; then
+     	AC_MSG_RESULT([no])
+	m4_default([$4], [AC_MSG_FAILURE(
+[The pkg-config script could not be found or is too old.  Make sure it
+is in your PATH or set the PKG_CONFIG environment variable to the full
+path to pkg-config.
+
+_PKG_TEXT
+
+To get pkg-config, see <http://pkg-config.freedesktop.org/>.])[]dnl
+        ])
+else
+	$1[]_CFLAGS=$pkg_cv_[]$1[]_CFLAGS
+	$1[]_LIBS=$pkg_cv_[]$1[]_LIBS
+        AC_MSG_RESULT([yes])
+	$3
+fi[]dnl
+])dnl PKG_CHECK_MODULES
+
+
+dnl PKG_CHECK_MODULES_STATIC(VARIABLE-PREFIX, MODULES, [ACTION-IF-FOUND],
+dnl   [ACTION-IF-NOT-FOUND])
+dnl ---------------------------------------------------------------------
+dnl Since: 0.29
+dnl
+dnl Checks for existence of MODULES and gathers its build flags with
+dnl static libraries enabled. Sets VARIABLE-PREFIX_CFLAGS from --cflags
+dnl and VARIABLE-PREFIX_LIBS from --libs.
+dnl
+dnl Note that if there is a possibility the first call to
+dnl PKG_CHECK_MODULES_STATIC might not happen, you should be sure to
+dnl include an explicit call to PKG_PROG_PKG_CONFIG in your
+dnl configure.ac.
+AC_DEFUN([PKG_CHECK_MODULES_STATIC],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])dnl
+_save_PKG_CONFIG=$PKG_CONFIG
+PKG_CONFIG="$PKG_CONFIG --static"
+PKG_CHECK_MODULES($@)
+PKG_CONFIG=$_save_PKG_CONFIG[]dnl
+])dnl PKG_CHECK_MODULES_STATIC
+
+
+dnl PKG_INSTALLDIR([DIRECTORY])
+dnl -------------------------
+dnl Since: 0.27
+dnl
+dnl Substitutes the variable pkgconfigdir as the location where a module
+dnl should install pkg-config .pc files. By default the directory is
+dnl $libdir/pkgconfig, but the default can be changed by passing
+dnl DIRECTORY. The user can override through the --with-pkgconfigdir
+dnl parameter.
+AC_DEFUN([PKG_INSTALLDIR],
+[m4_pushdef([pkg_default], [m4_default([$1], ['${libdir}/pkgconfig'])])
+m4_pushdef([pkg_description],
+    [pkg-config installation directory @<:@]pkg_default[@:>@])
+AC_ARG_WITH([pkgconfigdir],
+    [AS_HELP_STRING([--with-pkgconfigdir], pkg_description)],,
+    [with_pkgconfigdir=]pkg_default)
+AC_SUBST([pkgconfigdir], [$with_pkgconfigdir])
+m4_popdef([pkg_default])
+m4_popdef([pkg_description])
+])dnl PKG_INSTALLDIR
+
+
+dnl PKG_NOARCH_INSTALLDIR([DIRECTORY])
+dnl --------------------------------
+dnl Since: 0.27
+dnl
+dnl Substitutes the variable noarch_pkgconfigdir as the location where a
+dnl module should install arch-independent pkg-config .pc files. By
+dnl default the directory is $datadir/pkgconfig, but the default can be
+dnl changed by passing DIRECTORY. The user can override through the
+dnl --with-noarch-pkgconfigdir parameter.
+AC_DEFUN([PKG_NOARCH_INSTALLDIR],
+[m4_pushdef([pkg_default], [m4_default([$1], ['${datadir}/pkgconfig'])])
+m4_pushdef([pkg_description],
+    [pkg-config arch-independent installation directory @<:@]pkg_default[@:>@])
+AC_ARG_WITH([noarch-pkgconfigdir],
+    [AS_HELP_STRING([--with-noarch-pkgconfigdir], pkg_description)],,
+    [with_noarch_pkgconfigdir=]pkg_default)
+AC_SUBST([noarch_pkgconfigdir], [$with_noarch_pkgconfigdir])
+m4_popdef([pkg_default])
+m4_popdef([pkg_description])
+])dnl PKG_NOARCH_INSTALLDIR
+
+
+dnl PKG_CHECK_VAR(VARIABLE, MODULE, CONFIG-VARIABLE,
+dnl [ACTION-IF-FOUND], [ACTION-IF-NOT-FOUND])
+dnl -------------------------------------------
+dnl Since: 0.28
+dnl
+dnl Retrieves the value of the pkg-config variable for the given module.
+AC_DEFUN([PKG_CHECK_VAR],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])dnl
+AC_ARG_VAR([$1], [value of $3 for $2, overriding pkg-config])dnl
+
+_PKG_CONFIG([$1], [variable="][$3]["], [$2])
+AS_VAR_COPY([$1], [pkg_cv_][$1])
+
+AS_VAR_IF([$1], [""], [$5], [$4])dnl
+])dnl PKG_CHECK_VAR
diff --git a/configure b/configure
index 8468417f69..a43eee1583 100755
--- a/configure
+++ b/configure
@@ -715,6 +715,12 @@ krb_srvtab
 with_python
 with_perl
 with_tcl
+ICU_LIBS
+ICU_CFLAGS
+PKG_CONFIG_LIBDIR
+PKG_CONFIG_PATH
+PKG_CONFIG
+with_icu
 enable_thread_safety
 INCLUDES
 autodepend
@@ -821,6 +827,7 @@ with_CC
 enable_depend
 enable_cassert
 enable_thread_safety
+with_icu
 with_tcl
 with_tclconfig
 with_perl
@@ -856,6 +863,11 @@ LDFLAGS
 LIBS
 CPPFLAGS
 CPP
+PKG_CONFIG
+PKG_CONFIG_PATH
+PKG_CONFIG_LIBDIR
+ICU_CFLAGS
+ICU_LIBS
 LDFLAGS_EX
 LDFLAGS_SL
 DOCBOOKSTYLE'
@@ -1511,6 +1523,7 @@ Optional Packages:
   --with-wal-segsize=SEGSIZE
                           set WAL segment size in MB [16]
   --with-CC=CMD           set compiler (deprecated)
+  --with-icu              build with ICU support
   --with-tcl              build Tcl modules (PL/Tcl)
   --with-tclconfig=DIR    tclConfig.sh is in DIR
   --with-perl             build Perl modules (PL/Perl)
@@ -1546,6 +1559,13 @@ Some influential environment variables:
   CPPFLAGS    (Objective) C/C++ preprocessor flags, e.g. -I<include dir> if
               you have headers in a nonstandard directory <include dir>
   CPP         C preprocessor
+  PKG_CONFIG  path to pkg-config utility
+  PKG_CONFIG_PATH
+              directories to add to pkg-config's search path
+  PKG_CONFIG_LIBDIR
+              path overriding pkg-config's built-in search path
+  ICU_CFLAGS  C compiler flags for ICU, overriding pkg-config
+  ICU_LIBS    linker flags for ICU, overriding pkg-config
   LDFLAGS_EX  extra linker flags for linking executables only
   LDFLAGS_SL  extra linker flags for linking shared libraries only
   DOCBOOKSTYLE
@@ -5368,6 +5388,255 @@ $as_echo "$enable_thread_safety" >&6; }
 
 
 #
+# ICU
+#
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with ICU support" >&5
+$as_echo_n "checking whether to build with ICU support... " >&6; }
+
+
+
+# Check whether --with-icu was given.
+if test "${with_icu+set}" = set; then :
+  withval=$with_icu;
+  case $withval in
+    yes)
+
+$as_echo "#define USE_ICU 1" >>confdefs.h
+
+      ;;
+    no)
+      :
+      ;;
+    *)
+      as_fn_error $? "no argument expected for --with-icu option" "$LINENO" 5
+      ;;
+  esac
+
+else
+  with_icu=no
+
+fi
+
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $with_icu" >&5
+$as_echo "$with_icu" >&6; }
+
+
+if test "$with_icu" = yes; then
+
+
+
+
+
+
+
+if test "x$ac_cv_env_PKG_CONFIG_set" != "xset"; then
+	if test -n "$ac_tool_prefix"; then
+  # Extract the first word of "${ac_tool_prefix}pkg-config", so it can be a program name with args.
+set dummy ${ac_tool_prefix}pkg-config; ac_word=$2
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
+$as_echo_n "checking for $ac_word... " >&6; }
+if ${ac_cv_path_PKG_CONFIG+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  case $PKG_CONFIG in
+  [\\/]* | ?:[\\/]*)
+  ac_cv_path_PKG_CONFIG="$PKG_CONFIG" # Let the user override the test with a path.
+  ;;
+  *)
+  as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
+for as_dir in $PATH
+do
+  IFS=$as_save_IFS
+  test -z "$as_dir" && as_dir=.
+    for ac_exec_ext in '' $ac_executable_extensions; do
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
+    ac_cv_path_PKG_CONFIG="$as_dir/$ac_word$ac_exec_ext"
+    $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
+    break 2
+  fi
+done
+  done
+IFS=$as_save_IFS
+
+  ;;
+esac
+fi
+PKG_CONFIG=$ac_cv_path_PKG_CONFIG
+if test -n "$PKG_CONFIG"; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $PKG_CONFIG" >&5
+$as_echo "$PKG_CONFIG" >&6; }
+else
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+fi
+
+
+fi
+if test -z "$ac_cv_path_PKG_CONFIG"; then
+  ac_pt_PKG_CONFIG=$PKG_CONFIG
+  # Extract the first word of "pkg-config", so it can be a program name with args.
+set dummy pkg-config; ac_word=$2
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
+$as_echo_n "checking for $ac_word... " >&6; }
+if ${ac_cv_path_ac_pt_PKG_CONFIG+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  case $ac_pt_PKG_CONFIG in
+  [\\/]* | ?:[\\/]*)
+  ac_cv_path_ac_pt_PKG_CONFIG="$ac_pt_PKG_CONFIG" # Let the user override the test with a path.
+  ;;
+  *)
+  as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
+for as_dir in $PATH
+do
+  IFS=$as_save_IFS
+  test -z "$as_dir" && as_dir=.
+    for ac_exec_ext in '' $ac_executable_extensions; do
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
+    ac_cv_path_ac_pt_PKG_CONFIG="$as_dir/$ac_word$ac_exec_ext"
+    $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
+    break 2
+  fi
+done
+  done
+IFS=$as_save_IFS
+
+  ;;
+esac
+fi
+ac_pt_PKG_CONFIG=$ac_cv_path_ac_pt_PKG_CONFIG
+if test -n "$ac_pt_PKG_CONFIG"; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_pt_PKG_CONFIG" >&5
+$as_echo "$ac_pt_PKG_CONFIG" >&6; }
+else
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+fi
+
+  if test "x$ac_pt_PKG_CONFIG" = x; then
+    PKG_CONFIG=""
+  else
+    case $cross_compiling:$ac_tool_warned in
+yes:)
+{ $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: using cross tools not prefixed with host triplet" >&5
+$as_echo "$as_me: WARNING: using cross tools not prefixed with host triplet" >&2;}
+ac_tool_warned=yes ;;
+esac
+    PKG_CONFIG=$ac_pt_PKG_CONFIG
+  fi
+else
+  PKG_CONFIG="$ac_cv_path_PKG_CONFIG"
+fi
+
+fi
+if test -n "$PKG_CONFIG"; then
+	_pkg_min_version=0.9.0
+	{ $as_echo "$as_me:${as_lineno-$LINENO}: checking pkg-config is at least version $_pkg_min_version" >&5
+$as_echo_n "checking pkg-config is at least version $_pkg_min_version... " >&6; }
+	if $PKG_CONFIG --atleast-pkgconfig-version $_pkg_min_version; then
+		{ $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5
+$as_echo "yes" >&6; }
+	else
+		{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+		PKG_CONFIG=""
+	fi
+fi
+
+pkg_failed=no
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for ICU" >&5
+$as_echo_n "checking for ICU... " >&6; }
+
+if test -n "$ICU_CFLAGS"; then
+    pkg_cv_ICU_CFLAGS="$ICU_CFLAGS"
+ elif test -n "$PKG_CONFIG"; then
+    if test -n "$PKG_CONFIG" && \
+    { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"icu-uc icu-i18n\""; } >&5
+  ($PKG_CONFIG --exists --print-errors "icu-uc icu-i18n") 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; then
+  pkg_cv_ICU_CFLAGS=`$PKG_CONFIG --cflags "icu-uc icu-i18n" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes
+else
+  pkg_failed=yes
+fi
+ else
+    pkg_failed=untried
+fi
+if test -n "$ICU_LIBS"; then
+    pkg_cv_ICU_LIBS="$ICU_LIBS"
+ elif test -n "$PKG_CONFIG"; then
+    if test -n "$PKG_CONFIG" && \
+    { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"icu-uc icu-i18n\""; } >&5
+  ($PKG_CONFIG --exists --print-errors "icu-uc icu-i18n") 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; then
+  pkg_cv_ICU_LIBS=`$PKG_CONFIG --libs "icu-uc icu-i18n" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes
+else
+  pkg_failed=yes
+fi
+ else
+    pkg_failed=untried
+fi
+
+
+
+if test $pkg_failed = yes; then
+   	{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+
+if $PKG_CONFIG --atleast-pkgconfig-version 0.20; then
+        _pkg_short_errors_supported=yes
+else
+        _pkg_short_errors_supported=no
+fi
+        if test $_pkg_short_errors_supported = yes; then
+	        ICU_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "icu-uc icu-i18n" 2>&1`
+        else
+	        ICU_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "icu-uc icu-i18n" 2>&1`
+        fi
+	# Put the nasty error message in config.log where it belongs
+	echo "$ICU_PKG_ERRORS" >&5
+
+	as_fn_error $? "Package requirements (icu-uc icu-i18n) were not met:
+
+$ICU_PKG_ERRORS
+
+Consider adjusting the PKG_CONFIG_PATH environment variable if you
+installed software in a non-standard prefix.
+
+Alternatively, you may set the environment variables ICU_CFLAGS
+and ICU_LIBS to avoid the need to call pkg-config.
+See the pkg-config man page for more details." "$LINENO" 5
+elif test $pkg_failed = untried; then
+     	{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+	{ { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5
+$as_echo "$as_me: error: in \`$ac_pwd':" >&2;}
+as_fn_error $? "The pkg-config script could not be found or is too old.  Make sure it
+is in your PATH or set the PKG_CONFIG environment variable to the full
+path to pkg-config.
+
+Alternatively, you may set the environment variables ICU_CFLAGS
+and ICU_LIBS to avoid the need to call pkg-config.
+See the pkg-config man page for more details.
+
+To get pkg-config, see <http://pkg-config.freedesktop.org/>.
+See \`config.log' for more details" "$LINENO" 5; }
+else
+	ICU_CFLAGS=$pkg_cv_ICU_CFLAGS
+	ICU_LIBS=$pkg_cv_ICU_LIBS
+        { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5
+$as_echo "yes" >&6; }
+
+fi
+fi
+
+#
 # Optionally build Tcl modules (PL/Tcl)
 #
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with Tcl" >&5
@@ -13461,6 +13730,50 @@ fi
 done
 
 
+if test "$with_icu" = yes; then
+  # ICU functions are macros, so we need to do this the long way.
+
+  # ucol_strcollUTF8() appeared in ICU 50.
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for ucol_strcollUTF8" >&5
+$as_echo_n "checking for ucol_strcollUTF8... " >&6; }
+if ${pgac_cv_func_ucol_strcollUTF8+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  ac_save_CPPFLAGS=$CPPFLAGS
+CPPFLAGS="$ICU_CFLAGS $CPPFLAGS"
+ac_save_LIBS=$LIBS
+LIBS="$ICU_LIBS $LIBS"
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+#include <unicode/ucol.h>
+
+int
+main ()
+{
+ucol_strcollUTF8(NULL, NULL, 0, NULL, 0, NULL);
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+  pgac_cv_func_ucol_strcollUTF8=yes
+else
+  pgac_cv_func_ucol_strcollUTF8=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+    conftest$ac_exeext conftest.$ac_ext
+CPPFLAGS=$ac_save_CPPFLAGS
+LIBS=$ac_save_LIBS
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_func_ucol_strcollUTF8" >&5
+$as_echo "$pgac_cv_func_ucol_strcollUTF8" >&6; }
+  if test "$pgac_cv_func_ucol_strcollUTF8" = yes ; then
+
+$as_echo "#define HAVE_UCOL_STRCOLLUTF8 1" >>confdefs.h
+
+  fi
+fi
+
 # Lastly, restore full LIBS list and check for readline/libedit symbols
 LIBS="$LIBS_including_readline"
 
diff --git a/configure.in b/configure.in
index 01b618c931..8b1d957e44 100644
--- a/configure.in
+++ b/configure.in
@@ -614,6 +614,19 @@ AC_MSG_RESULT([$enable_thread_safety])
 AC_SUBST(enable_thread_safety)
 
 #
+# ICU
+#
+AC_MSG_CHECKING([whether to build with ICU support])
+PGAC_ARG_BOOL(with, icu, no, [build with ICU support],
+              [AC_DEFINE([USE_ICU], 1, [Define to build with ICU support. (--with-icu)])])
+AC_MSG_RESULT([$with_icu])
+AC_SUBST(with_icu)
+
+if test "$with_icu" = yes; then
+  PKG_CHECK_MODULES(ICU, icu-uc icu-i18n)
+fi
+
+#
 # Optionally build Tcl modules (PL/Tcl)
 #
 AC_MSG_CHECKING([whether to build with Tcl])
@@ -1640,6 +1653,28 @@ fi
 AC_CHECK_FUNCS([strtoll strtoq], [break])
 AC_CHECK_FUNCS([strtoull strtouq], [break])
 
+if test "$with_icu" = yes; then
+  # ICU functions are macros, so we need to do this the long way.
+
+  # ucol_strcollUTF8() appeared in ICU 50.
+  AC_CACHE_CHECK([for ucol_strcollUTF8], [pgac_cv_func_ucol_strcollUTF8],
+[ac_save_CPPFLAGS=$CPPFLAGS
+CPPFLAGS="$ICU_CFLAGS $CPPFLAGS"
+ac_save_LIBS=$LIBS
+LIBS="$ICU_LIBS $LIBS"
+AC_LINK_IFELSE([AC_LANG_PROGRAM(
+[#include <unicode/ucol.h>
+],
+[ucol_strcollUTF8(NULL, NULL, 0, NULL, 0, NULL);])],
+[pgac_cv_func_ucol_strcollUTF8=yes],
+[pgac_cv_func_ucol_strcollUTF8=no])
+CPPFLAGS=$ac_save_CPPFLAGS
+LIBS=$ac_save_LIBS])
+  if test "$pgac_cv_func_ucol_strcollUTF8" = yes ; then
+    AC_DEFINE([HAVE_UCOL_STRCOLLUTF8], 1, [Define to 1 if you have the `ucol_strcollUTF8' function.])
+  fi
+fi
+
 # Lastly, restore full LIBS list and check for readline/libedit symbols
 LIBS="$LIBS_including_readline"
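To make the new configure probe concrete, the following throwaway program (not part of the patch; the "de" collator and the test strings are arbitrary examples) is roughly what the link test above boils down to:

    /*
     * Standalone sketch of the ucol_strcollUTF8() usage that the configure
     * test links against.  Compile with:
     *   cc icu_strcoll_test.c $(pkg-config --cflags --libs icu-uc icu-i18n)
     */
    #include <stdio.h>
    #include <unicode/ucol.h>

    int
    main(void)
    {
        UErrorCode      status = U_ZERO_ERROR;
        UCollator      *coll = ucol_open("de", &status);
        UCollationResult r;

        if (U_FAILURE(status))
        {
            fprintf(stderr, "ucol_open: %s\n", u_errorName(status));
            return 1;
        }

        /* compare two UTF-8 strings directly, without converting to UChar */
        r = ucol_strcollUTF8(coll, "Bar", -1, "Bär", -1, &status);
        printf("result: %d (%s)\n", (int) r, u_errorName(status));

        ucol_close(coll);
        return 0;
    }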
 
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 96cb9185c2..65b7f081ec 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -2004,6 +2004,14 @@ <title><structname>pg_collation</> Columns</title>
      </row>
 
      <row>
+      <entry><structfield>collprovider</structfield></entry>
+      <entry><type>char</type></entry>
+      <entry></entry>
+      <entry>Provider of the collation: <literal>d</literal> = database
+       default, <literal>c</literal> = libc, <literal>i</literal> = icu</entry>
+     </row>
+
+     <row>
       <entry><structfield>collencoding</structfield></entry>
       <entry><type>int4</type></entry>
       <entry></entry>
@@ -2024,6 +2032,17 @@ <title><structname>pg_collation</> Columns</title>
       <entry></entry>
       <entry><symbol>LC_CTYPE</> for this collation object</entry>
      </row>
+
+     <row>
+      <entry><structfield>collversion</structfield></entry>
+      <entry><type>text</type></entry>
+      <entry></entry>
+      <entry>
+       Provider-specific version of the collation.  This is recorded when the
+       collation is created and then checked when it is used, to detect
+       changes in the collation definition that could lead to data corruption.
+      </entry>
+     </row>
     </tbody>
    </tgroup>
   </table>
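For illustration only: one plausible way to obtain a provider-specific version string from ICU, in the spirit of what get_system_collation_version() (implemented elsewhere in the patch) has to do for the icu provider.  The "de" locale is just an example:

    #include <stdio.h>
    #include <unicode/ucol.h>

    static int
    print_collator_version(const char *locale)
    {
        UErrorCode      status = U_ZERO_ERROR;
        UCollator      *coll = ucol_open(locale, &status);
        UVersionInfo    versioninfo;
        char            versionstr[U_MAX_VERSION_STRING_LENGTH];

        if (U_FAILURE(status))
            return -1;

        /* dotted version string, similar to the ones in the warning example */
        ucol_getVersion(coll, versioninfo);
        u_versionToString(versioninfo, versionstr);
        printf("%s: %s\n", locale, versionstr);

        ucol_close(coll);
        return 0;
    }

    int
    main(void)
    {
        return print_collator_version("de");
    }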
diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml
index 2aba0fc528..01a38931fa 100644
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@@ -500,21 +500,45 @@ <title>Concepts</title>
    <title>Managing Collations</title>
 
    <para>
-    A collation is an SQL schema object that maps an SQL name to
-    operating system locales.  In particular, it maps to a combination
-    of <symbol>LC_COLLATE</symbol> and <symbol>LC_CTYPE</symbol>.  (As
+    A collation is an SQL schema object that maps an SQL name to locales
+    provided by libraries installed in the operating system.  A collation
+    definition has a <firstterm>provider</firstterm> that specifies which
+    library supplies the locale data.  One standard provider name
+    is <literal>libc</literal>, which uses the locales provided by the
+    operating system C library.  These are the locales that most tools
+    provided by the operating system use.  Another provider
+    is <literal>icu</literal>, which uses the external
+    ICU<indexterm><primary>ICU</></> library.  Support for ICU has to be
+    configured when PostgreSQL is built.
+   </para>
+
+   <para>
+    A collation object provided by <literal>libc</literal> maps to a combination
+    of <symbol>LC_COLLATE</symbol> and <symbol>LC_CTYPE</symbol> settings.  (As
     the name would suggest, the main purpose of a collation is to set
     <symbol>LC_COLLATE</symbol>, which controls the sort order.  But
     it is rarely necessary in practice to have an
     <symbol>LC_CTYPE</symbol> setting that is different from
     <symbol>LC_COLLATE</symbol>, so it is more convenient to collect
     these under one concept than to create another infrastructure for
-    setting <symbol>LC_CTYPE</symbol> per expression.)  Also, a collation
+    setting <symbol>LC_CTYPE</symbol> per expression.)  Also, a <literal>libc</literal> collation
     is tied to a character set encoding (see <xref linkend="multibyte">).
     The same collation name may exist for different encodings.
    </para>
 
    <para>
+    A collation provided by <literal>icu</literal> maps to a named collator
+    provided by the ICU library.  ICU does not support
+    separate <quote>collate</quote> and <quote>ctype</quote> settings, so they
+    are always the same.  Also, ICU collations are independent of the
+    encoding, so there is always only one ICU collation for a given name in a
+    database.
+   </para>
+
+   <sect3>
+    <title>Standard Collations</title>
+
+   <para>
     On all platforms, the collations named <literal>default</>,
     <literal>C</>, and <literal>POSIX</> are available.  Additional
     collations may be available depending on operating system support.
@@ -528,12 +552,36 @@ <title>Managing Collations</title>
    </para>
 
    <para>
+    Additionally, the SQL standard collation name <literal>ucs_basic</literal>
+    is available for encoding <literal>UTF8</literal>.  It is equivalent
+    to <literal>C</literal> and sorts by Unicode code point.
+   </para>
+  </sect3>
+
+  <sect3>
+   <title>Predefined Collations</title>
+
+   <para>
     If the operating system provides support for using multiple locales
     within a single program (<function>newlocale</> and related functions),
+    or support for ICU is configured,
     then when a database cluster is initialized, <command>initdb</command>
     populates the system catalog <literal>pg_collation</literal> with
     collations based on all the locales it finds on the operating
-    system at the time.  For example, the operating system might
+    system at the time.
+   </para>
+
+   <para>
+    To inspect the currently available locales, use the query <literal>SELECT
+    * FROM pg_collation</literal>, or the command <command>\dOS+</command>
+    in <application>psql</application>.
+   </para>
+
+  <sect4>
+   <title>libc collations</title>
+
+   <para>
+    For example, the operating system might
     provide a locale named <literal>de_DE.utf8</literal>.
     <command>initdb</command> would then create a collation named
     <literal>de_DE.utf8</literal> for encoding <literal>UTF8</literal>
@@ -544,17 +592,20 @@ <title>Managing Collations</title>
     under the name <literal>de_DE</literal>, which is less cumbersome
     to write and makes the name less encoding-dependent.  Note that,
     nevertheless, the initial set of collation names is
-    platform-dependent.
+    platform-dependent.  Collations provided by ICU are created
+    with <literal>%icu</literal> appended, so <literal>de%icu</literal> for
+    example.
    </para>
 
    <para>
-    In case a collation is needed that has different values for
-    <symbol>LC_COLLATE</symbol> and <symbol>LC_CTYPE</symbol>, a new
-    collation may be created using
-    the <xref linkend="sql-createcollation"> command.  That command
-    can also be used to create a new collation from an existing
-    collation, which can be useful to be able to use
-    operating-system-independent collation names in applications.
+    The default set of collations provided by <literal>libc</literal> map
+    directly to the locales installed in the operating system, which can be
+    listed using the command <literal>locale -a</literal>.
+    In case a <literal>libc</literal> collation is needed that has different values for
+    <symbol>LC_COLLATE</symbol> and <symbol>LC_CTYPE</symbol>, or new locales
+    are installed in the operating system after the database system was
+    initialized, then a new collation may be created using
+    the <xref linkend="sql-createcollation"> command.
    </para>
 
    <para>
@@ -566,8 +617,8 @@ <title>Managing Collations</title>
     Use of the stripped collation names is recommended, since it will
     make one less thing you need to change if you decide to change to
     another database encoding.  Note however that the <literal>default</>,
-    <literal>C</>, and <literal>POSIX</> collations can be used
-    regardless of the database encoding.
+    <literal>C</>, and <literal>POSIX</> collations, as well as all collations
+    provided by ICU can be used regardless of the database encoding.
    </para>
 
    <para>
@@ -581,6 +632,97 @@ <title>Managing Collations</title>
     collations have identical behaviors.  Mixing stripped and non-stripped
     collation names is therefore not recommended.
    </para>
+  </sect4>
+
+  <sect4>
+   <title>ICU collations</title>
+
+   <para>
+    With ICU, it is not sensible to enumerate all possible locale names.  ICU
+    uses a particular naming system for locales, but there are many more ways
+    to name a locale than there are actually distinct locales.  (In fact, any
+    string will be accepted as a locale name.)
+    See <ulink url="http://userguide.icu-project.org/locale"></ulink> for
+    information on ICU locale naming.  <command>initdb</command> uses the ICU
+    APIs to extract a set of locales with distinct collation rules to populate
+    the initial set of collations.  Here are some examples collations that
+    might be created:
+
+    <variablelist>
+     <varlistentry>
+      <term><literal>de%icu</literal></term>
+      <listitem>
+       <para>German collation, default variant</para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><literal>de_phonebook%icu</literal></term>
+      <listitem>
+       <para>German collation, phone book variant</para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><literal>de_AT%icu</literal></term>
+      <listitem>
+       <para>German collation for Austria, default variant</para>
+       <para>
+        (Note that as of this writing, there is no,
+        say, <literal>de_DE%icu</literal> or <literal>de_CH%icu</literal>,
+        since those are equivalent to <literal>de%icu</literal>.)
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><literal>de_AT_phonebook%icu</literal></term>
+      <listitem>
+       <para>German collation for Austria, phone book variant</para>
+      </listitem>
+     </varlistentry>
+     <varlistentry>
+      <term><literal>root%icu</literal></term>
+      <listitem>
+       <para>
+        ICU <quote>root</quote> collation.  Use this to get a reasonable
+        language-agnostic sort order.
+       </para>
+      </listitem>
+     </varlistentry>
+    </variablelist>
+   </para>
+
+   <para>
+    Some (less frequently used) encodings are not supported by ICU.  If the
+    database cluster was initialized with such an encoding, no ICU collations
+    will be predefined.
+   </para>
+   </sect4>
+   </sect3>
+
+   <sect3>
+   <title>Copying Collations</title>
+
+   <para>
+    The command <xref linkend="sql-createcollation"> can also be used to
+    create a new collation from an existing collation.  This can be useful for
+    using operating-system-independent collation names in applications,
+    creating compatibility names, or using an ICU-provided collation under a
+    suffix-less name.  For example:
+<programlisting>
+CREATE COLLATION german FROM "de_DE";
+CREATE COLLATION french FROM "fr%icu";
+CREATE COLLATION "de_DE%icu" FROM "de%icu";
+</programlisting>
+   </para>
+
+   <para>
+    The standard and predefined collations are in the
+    schema <literal>pg_catalog</literal>, like all predefined objects.
+    User-defined collations should be created in user schemas.  This also
+    ensures that they are saved by <command>pg_dump</command>.
+   </para>
   </sect2>
  </sect1>
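Illustration, not patch code: the keyword variants behind names like de_phonebook%icu come straight out of ICU's own enumeration, the same API pg_import_system_collations() uses further down in this patch.  A throwaway program listing them for German might look like this:

    #include <stdio.h>
    #include <unicode/ucol.h>
    #include <unicode/uenum.h>

    int
    main(void)
    {
        UErrorCode      status = U_ZERO_ERROR;
        UEnumeration   *vals;
        const char     *val;

        /* which collation variants exist for German? */
        vals = ucol_getKeywordValuesForLocale("collation", "de", TRUE, &status);
        if (U_FAILURE(status))
        {
            fprintf(stderr, "%s\n", u_errorName(status));
            return 1;
        }

        /* the base locale itself ("de%icu") is created separately by the patch */
        while ((val = uenum_next(vals, NULL, &status)) != NULL)
            printf("de_%s%%icu  ->  de@collation=%s\n", val, val);

        uenum_close(vals);
        return 0;
    }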
 
diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
index 4431ed75a9..5c96cba51a 100644
--- a/doc/src/sgml/installation.sgml
+++ b/doc/src/sgml/installation.sgml
@@ -771,6 +771,20 @@ <title>Configuration</title>
       </varlistentry>
 
       <varlistentry>
+       <term><option>--with-icu</option></term>
+       <listitem>
+        <para>
+         Build with support for
+         the <productname>ICU</productname><indexterm><primary>ICU</></>
+         library.  This requires the <productname>ICU4C</productname> package
+         as well
+         as <productname>pkg-config</productname><indexterm><primary>pkg-config</></>
+         to be installed.
+        </para>
+       </listitem>
+      </varlistentry>
+
+      <varlistentry>
        <term><option>--with-openssl</option>
        <indexterm>
         <primary>OpenSSL</primary>
diff --git a/doc/src/sgml/ref/alter_collation.sgml b/doc/src/sgml/ref/alter_collation.sgml
index 6708c7e10e..3bca3f1396 100644
--- a/doc/src/sgml/ref/alter_collation.sgml
+++ b/doc/src/sgml/ref/alter_collation.sgml
@@ -21,6 +21,8 @@
 
  <refsynopsisdiv>
 <synopsis>
+ALTER COLLATION <replaceable>name</replaceable> REFRESH VERSION
+
 ALTER COLLATION <replaceable>name</replaceable> RENAME TO <replaceable>new_name</replaceable>
 ALTER COLLATION <replaceable>name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_USER | SESSION_USER }
 ALTER COLLATION <replaceable>name</replaceable> SET SCHEMA <replaceable>new_schema</replaceable>
@@ -85,9 +87,49 @@ <title>Parameters</title>
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term><literal>REFRESH VERSION</literal></term>
+    <listitem>
+     <para>
+      Update the collation version.
+      See <xref linkend="sql-altercollation-notes"> below.
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
  </refsect1>
 
+ <refsect1 id="sql-altercollation-notes">
+  <title>Notes</title>
+
+  <para>
+   When using collations provided by the ICU library, the ICU-specific version
+   of the collator is recorded in the system catalog when the collation object
+   is created.  When the collation is then used, the current version is
+   checked against the recorded version, and a warning is issued when there is
+   a mismatch, for example:
+<screen>
+WARNING:  ICU collator version mismatch
+DETAIL:  The database was created using version 1.2.3.4, the library provides version 2.3.4.5.
+HINT:  Rebuild all objects affected by this collation and run ALTER COLLATION pg_catalog."xx%icu" REFRESH VERSION, or build PostgreSQL with the right version of ICU.
+</screen>
+   A change in collation definitions can lead to corrupt indexes and other
+   problems where the database system relies on stored objects having a
+   certain sort order.  Generally, this should be avoided, but it can happen
+   in legitimate circumstances, such as when
+   using <command>pg_upgrade</command> to upgrade to server binaries linked
+   with a newer version of ICU.  When this happens, all objects depending on
+   the collation should be rebuilt, for example,
+   using <command>REINDEX</command>.  When that is done, the collation version
+   can be refreshed using the command <literal>ALTER COLLATION ... REFRESH
+   VERSION</literal>.  This will update the system catalog to record the
+   current collator version and will make the warning go away.  Note that this
+   does not actually check whether all affected objects have been rebuilt
+   correctly.
+  </para>
+ </refsect1>
+
  <refsect1>
   <title>Examples</title>
 
diff --git a/doc/src/sgml/ref/create_collation.sgml b/doc/src/sgml/ref/create_collation.sgml
index c09e5bd6d4..ee7a8ea5ae 100644
--- a/doc/src/sgml/ref/create_collation.sgml
+++ b/doc/src/sgml/ref/create_collation.sgml
@@ -21,7 +21,8 @@
 CREATE COLLATION [ IF NOT EXISTS ] <replaceable>name</replaceable> (
     [ LOCALE = <replaceable>locale</replaceable>, ]
     [ LC_COLLATE = <replaceable>lc_collate</replaceable>, ]
-    [ LC_CTYPE = <replaceable>lc_ctype</replaceable> ]
+    [ LC_CTYPE = <replaceable>lc_ctype</replaceable>, ]
+    [ PROVIDER = <replaceable>provider</replaceable> ]
 )
 CREATE COLLATION [ IF NOT EXISTS ] <replaceable>name</replaceable> FROM <replaceable>existing_collation</replaceable>
 </synopsis>
@@ -114,6 +115,19 @@ <title>Parameters</title>
     </varlistentry>
 
     <varlistentry>
+     <term><replaceable>provider</replaceable></term>
+
+     <listitem>
+      <para>
+       Specifies the provider to use for locale services associated with this
+       collation.  Possible values
+       are: <literal>icu</literal>,<indexterm><primary>ICU</></> <literal>libc</literal>.
+       The available choices depend on the operating system and build options.
+      </para>
+     </listitem>
+    </varlistentry>
+
+    <varlistentry>
      <term><replaceable>existing_collation</replaceable></term>
 
      <listitem>
diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index 59bd7996d1..218d06c2b3 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -179,6 +179,7 @@ pgxsdir = $(pkglibdir)/pgxs
 #
 # Records the choice of the various --enable-xxx and --with-xxx options.
 
+with_icu	= @with_icu@
 with_perl	= @with_perl@
 with_python	= @with_python@
 with_tcl	= @with_tcl@
@@ -208,6 +209,9 @@ python_version		= @python_version@
 
 krb_srvtab = @krb_srvtab@
 
+ICU_CFLAGS		= @ICU_CFLAGS@
+ICU_LIBS		= @ICU_LIBS@
+
 TCLSH			= @TCLSH@
 TCL_LIBS		= @TCL_LIBS@
 TCL_LIB_SPEC		= @TCL_LIB_SPEC@
diff --git a/src/backend/Makefile b/src/backend/Makefile
index 7a0bbb2942..fffb0d95ba 100644
--- a/src/backend/Makefile
+++ b/src/backend/Makefile
@@ -58,7 +58,7 @@ ifneq ($(PORTNAME), win32)
 ifneq ($(PORTNAME), aix)
 
 postgres: $(OBJS)
-	$(CC) $(CFLAGS) $(LDFLAGS) $(LDFLAGS_EX) $(export_dynamic) $(call expand_subsys,$^) $(LIBS) -o $@
+	$(CC) $(CFLAGS) $(LDFLAGS) $(LDFLAGS_EX) $(export_dynamic) $(call expand_subsys,$^) $(LIBS) $(ICU_LIBS) -o $@
 
 endif
 endif
diff --git a/src/backend/catalog/pg_collation.c b/src/backend/catalog/pg_collation.c
index 65b6051c0d..ffe68aa114 100644
--- a/src/backend/catalog/pg_collation.c
+++ b/src/backend/catalog/pg_collation.c
@@ -27,6 +27,7 @@
 #include "mb/pg_wchar.h"
 #include "utils/builtins.h"
 #include "utils/fmgroids.h"
+#include "utils/pg_locale.h"
 #include "utils/rel.h"
 #include "utils/syscache.h"
 #include "utils/tqual.h"
@@ -40,10 +41,12 @@
 Oid
 CollationCreate(const char *collname, Oid collnamespace,
 				Oid collowner,
+				char collprovider,
 				int32 collencoding,
 				const char *collcollate, const char *collctype,
 				bool if_not_exists)
 {
+	char	   *collversion;
 	Relation	rel;
 	TupleDesc	tupDesc;
 	HeapTuple	tup;
@@ -78,21 +81,27 @@ CollationCreate(const char *collname, Oid collnamespace,
 		{
 			ereport(NOTICE,
 				(errcode(ERRCODE_DUPLICATE_OBJECT),
-				 errmsg("collation \"%s\" for encoding \"%s\" already exists, skipping",
-						collname, pg_encoding_to_char(collencoding))));
+				 collencoding == -1
+				 ? errmsg("collation \"%s\" already exists, skipping",
+						  collname)
+				 : errmsg("collation \"%s\" for encoding \"%s\" already exists, skipping",
+						  collname, pg_encoding_to_char(collencoding))));
 			return InvalidOid;
 		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_DUPLICATE_OBJECT),
-					 errmsg("collation \"%s\" for encoding \"%s\" already exists",
-							collname, pg_encoding_to_char(collencoding))));
+					 collencoding == -1
+					 ? errmsg("collation \"%s\" already exists",
+							  collname)
+					 : errmsg("collation \"%s\" for encoding \"%s\" already exists",
+							  collname, pg_encoding_to_char(collencoding))));
 	}
 
 	/*
 	 * Also forbid matching an any-encoding entry.  This test of course is not
 	 * backed up by the unique index, but it's not a problem since we don't
-	 * support adding any-encoding entries after initdb.
+	 * support adding any-encoding entries after initdb. FIXME
 	 */
 	if (SearchSysCacheExists3(COLLNAMEENCNSP,
 							  PointerGetDatum(collname),
@@ -114,6 +123,8 @@ CollationCreate(const char *collname, Oid collnamespace,
 						collname)));
 	}
 
+	collversion = get_system_collation_version(collprovider, collcollate);
+
 	/* open pg_collation */
 	rel = heap_open(CollationRelationId, RowExclusiveLock);
 	tupDesc = RelationGetDescr(rel);
@@ -125,11 +136,16 @@ CollationCreate(const char *collname, Oid collnamespace,
 	values[Anum_pg_collation_collname - 1] = NameGetDatum(&name_name);
 	values[Anum_pg_collation_collnamespace - 1] = ObjectIdGetDatum(collnamespace);
 	values[Anum_pg_collation_collowner - 1] = ObjectIdGetDatum(collowner);
+	values[Anum_pg_collation_collprovider - 1] = CharGetDatum(collprovider);
 	values[Anum_pg_collation_collencoding - 1] = Int32GetDatum(collencoding);
 	namestrcpy(&name_collate, collcollate);
 	values[Anum_pg_collation_collcollate - 1] = NameGetDatum(&name_collate);
 	namestrcpy(&name_ctype, collctype);
 	values[Anum_pg_collation_collctype - 1] = NameGetDatum(&name_ctype);
+	if (collversion)
+		values[Anum_pg_collation_collversion - 1] = CStringGetTextDatum(collversion);
+	else
+		nulls[Anum_pg_collation_collversion - 1] = true;
 
 	tup = heap_form_tuple(tupDesc, values, nulls);
 
diff --git a/src/backend/commands/collationcmds.c b/src/backend/commands/collationcmds.c
index 919cfc6a06..5f9f142de8 100644
--- a/src/backend/commands/collationcmds.c
+++ b/src/backend/commands/collationcmds.c
@@ -14,15 +14,18 @@
  */
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/htup_details.h"
 #include "access/xact.h"
 #include "catalog/dependency.h"
 #include "catalog/indexing.h"
 #include "catalog/namespace.h"
+#include "catalog/objectaccess.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_collation_fn.h"
 #include "commands/alter.h"
 #include "commands/collationcmds.h"
+#include "commands/comment.h"
 #include "commands/dbcommands.h"
 #include "commands/defrem.h"
 #include "mb/pg_wchar.h"
@@ -33,6 +36,7 @@
 #include "utils/rel.h"
 #include "utils/syscache.h"
 
+
 /*
  * CREATE COLLATION
  */
@@ -47,8 +51,12 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
 	DefElem    *localeEl = NULL;
 	DefElem    *lccollateEl = NULL;
 	DefElem    *lcctypeEl = NULL;
+	DefElem    *providerEl = NULL;
 	char	   *collcollate = NULL;
 	char	   *collctype = NULL;
+	char	   *collproviderstr = NULL;
+	int			collencoding;
+	char		collprovider = 0;
 	Oid			newoid;
 	ObjectAddress address;
 
@@ -72,6 +80,8 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
 			defelp = &lccollateEl;
 		else if (pg_strcasecmp(defel->defname, "lc_ctype") == 0)
 			defelp = &lcctypeEl;
+		else if (pg_strcasecmp(defel->defname, "provider") == 0)
+			defelp = &providerEl;
 		else
 		{
 			ereport(ERROR,
@@ -103,6 +113,7 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
 
 		collcollate = pstrdup(NameStr(((Form_pg_collation) GETSTRUCT(tp))->collcollate));
 		collctype = pstrdup(NameStr(((Form_pg_collation) GETSTRUCT(tp))->collctype));
+		collprovider = ((Form_pg_collation) GETSTRUCT(tp))->collprovider;
 
 		ReleaseSysCache(tp);
 	}
@@ -119,6 +130,24 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
 	if (lcctypeEl)
 		collctype = defGetString(lcctypeEl);
 
+	if (providerEl)
+		collproviderstr = defGetString(providerEl);
+
+	if (collproviderstr)
+	{
+		if (pg_strcasecmp(collproviderstr, "icu") == 0)
+			collprovider = COLLPROVIDER_ICU;
+		else if (pg_strcasecmp(collproviderstr, "libc") == 0)
+			collprovider = COLLPROVIDER_LIBC;
+		else
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+					 errmsg("unrecognized collation provider: %s",
+							collproviderstr)));
+	}
+	else if (!fromEl)
+		collprovider = COLLPROVIDER_LIBC;
+
 	if (!collcollate)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
@@ -129,12 +158,19 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
 				(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
 				 errmsg("parameter \"lc_ctype\" must be specified")));
 
-	check_encoding_locale_matches(GetDatabaseEncoding(), collcollate, collctype);
+	if (collprovider == COLLPROVIDER_ICU)
+		collencoding = -1;
+	else
+	{
+		collencoding = GetDatabaseEncoding();
+		check_encoding_locale_matches(collencoding, collcollate, collctype);
+	}
 
 	newoid = CollationCreate(collName,
 							 collNamespace,
 							 GetUserId(),
-							 GetDatabaseEncoding(),
+							 collprovider,
+							 collencoding,
 							 collcollate,
 							 collctype,
 							 if_not_exists);
@@ -182,6 +218,79 @@ IsThereCollationInNamespace(const char *collname, Oid nspOid)
 						collname, get_namespace_name(nspOid))));
 }
 
+/*
+ * ALTER COLLATION
+ */
+ObjectAddress
+AlterCollation(AlterCollationStmt *stmt)
+{
+	Relation	rel;
+	Oid			collOid;
+	HeapTuple	tup;
+	Form_pg_collation collForm;
+	Datum		collversion;
+	bool		isnull;
+	char	   *oldversion;
+	char	   *newversion;
+	ObjectAddress address;
+
+	rel = heap_open(CollationRelationId, RowExclusiveLock);
+	collOid = get_collation_oid(stmt->collname, false);
+
+	if (!pg_collation_ownercheck(collOid, GetUserId()))
+		aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_COLLATION,
+					   NameListToString(stmt->collname));
+
+	tup = SearchSysCacheCopy1(COLLOID, ObjectIdGetDatum(collOid));
+	if (!HeapTupleIsValid(tup))
+		elog(ERROR, "cache lookup failed for collation %u", collOid);
+
+	collForm = (Form_pg_collation) GETSTRUCT(tup);
+	collversion = SysCacheGetAttr(COLLOID, tup, Anum_pg_collation_collversion,
+								  &isnull);
+	oldversion = isnull ? NULL : TextDatumGetCString(collversion);
+
+	newversion = get_system_collation_version(collForm->collprovider, NameStr(collForm->collcollate));
+
+	/* cannot change from NULL to non-NULL or vice versa */
+	if ((!oldversion && newversion) || (oldversion && !newversion))
+		elog(ERROR, "invalid collation version change");
+	else if (oldversion && newversion && strcmp(newversion, oldversion) != 0)
+	{
+		bool        nulls[Natts_pg_collation];
+		bool        replaces[Natts_pg_collation];
+		Datum       values[Natts_pg_collation];
+
+		ereport(NOTICE,
+				(errmsg("changing version from %s to %s",
+						oldversion, newversion)));
+
+		memset(values, 0, sizeof(values));
+		memset(nulls, false, sizeof(nulls));
+		memset(replaces, false, sizeof(replaces));
+
+		values[Anum_pg_collation_collversion - 1] = CStringGetTextDatum(newversion);
+		replaces[Anum_pg_collation_collversion - 1] = true;
+
+		tup = heap_modify_tuple(tup, RelationGetDescr(rel),
+								values, nulls, replaces);
+	}
+	else
+		ereport(NOTICE,
+				(errmsg("version has not changed")));
+
+	CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+	InvokeObjectPostAlterHook(CollationRelationId, collOid, 0);
+
+	ObjectAddressSet(address, CollationRelationId, collOid);
+
+	heap_freetuple(tup);
+	heap_close(rel, NoLock);
+
+	return address;
+}
+
 
 /*
  * "Normalize" a locale name, stripping off encoding tags such as
@@ -219,6 +328,29 @@ normalize_locale_name(char *new, const char *old)
 }
 
 
+#ifdef USE_ICU
+static char *
+get_icu_locale_comment(const char *localename)
+{
+	UErrorCode	status;
+	UChar		displayname[128];
+	int32		len_uchar;
+	char	   *result;
+
+	status = U_ZERO_ERROR;
+	len_uchar = uloc_getDisplayName(localename, "en", &displayname[0], sizeof(displayname) / sizeof(displayname[0]), &status);
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("could not get display name for locale \"%s\": %s",
+						localename, u_errorName(status))));
+
+	icu_from_uchar(&result, displayname, len_uchar);
+
+	return result;
+}
+#endif	/* USE_ICU */
+
+
 Datum
 pg_import_system_collations(PG_FUNCTION_ARGS)
 {
@@ -302,7 +434,7 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
 
 		count++;
 
-		CollationCreate(localebuf, nspid, GetUserId(), enc,
+		CollationCreate(localebuf, nspid, GetUserId(), COLLPROVIDER_LIBC, enc,
 						localebuf, localebuf, if_not_exists);
 
 		CommandCounterIncrement();
@@ -333,7 +465,7 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
 		char	   *locale = (char *) lfirst(lcl);
 		int			enc = lfirst_int(lce);
 
-		CollationCreate(alias, nspid, GetUserId(), enc,
+		CollationCreate(alias, nspid, GetUserId(), COLLPROVIDER_LIBC, enc,
 						locale, locale, true);
 		CommandCounterIncrement();
 	}
@@ -343,5 +475,70 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
 				(errmsg("no usable system locales were found")));
 #endif   /* not HAVE_LOCALE_T && not WIN32 */
 
+#ifdef USE_ICU
+	if (!is_encoding_supported_by_icu(GetDatabaseEncoding()))
+	{
+		ereport(NOTICE,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("encoding \"%s\" not supported by ICU",
+						pg_encoding_to_char(GetDatabaseEncoding()))));
+	}
+	else
+	{
+		int i;
+
+		/*
+		 * Start the loop at -1 to sneak in the root locale without too much
+		 * code duplication.
+		 */
+		for (i = -1; i < ucol_countAvailable(); i++)
+		{
+			const char *name;
+			UEnumeration *en;
+			UErrorCode	status;
+			const char *val;
+			Oid			collid;
+
+			if (i == -1)
+				name = "";  /* ICU root locale */
+			else
+				name = ucol_getAvailable(i);
+			collid = CollationCreate(psprintf("%s%%icu", i == -1 ? "root" : name),
+									 nspid, GetUserId(), COLLPROVIDER_ICU, -1,
+									 name, name, if_not_exists);
+
+			CreateComments(collid, CollationRelationId, 0,
+						   get_icu_locale_comment(name));
+
+			/*
+			 * Add keyword variants
+			 */
+			status = U_ZERO_ERROR;
+			en = ucol_getKeywordValuesForLocale("collation", name, TRUE, &status);
+			if (U_FAILURE(status))
+				ereport(ERROR,
+						(errmsg("could not get keyword values for locale \"%s\": %s",
+								name, u_errorName(status))));
+
+			status = U_ZERO_ERROR;
+			uenum_reset(en, &status);
+			while ((val = uenum_next(en, NULL, &status)))
+			{
+				char *localename = psprintf("%s@collation=%s", name, val);
+				collid = CollationCreate(psprintf("%s_%s%%icu", i == -1 ? "root" : name, val),
+										 nspid, GetUserId(), COLLPROVIDER_ICU, -1,
+										 localename, localename, if_not_exists);
+				CreateComments(collid, CollationRelationId, 0,
+							   get_icu_locale_comment(localename));
+			}
+			if (U_FAILURE(status))
+				ereport(ERROR,
+						(errmsg("could not get keyword values for locale \"%s\": %s",
+								name, u_errorName(status))));
+			uenum_close(en);
+		}
+	}
+#endif
+
 	PG_RETURN_VOID();
 }
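Illustration, not patch code: the per-collation comments created above come from uloc_getDisplayName().  The patch's get_icu_locale_comment() converts the result with the icu_from_uchar() helper added elsewhere in the patch; this standalone sketch uses u_strToUTF8() instead, and the printed text is only an approximation of what ICU actually returns:

    #include <stdio.h>
    #include <unicode/uloc.h>
    #include <unicode/ustring.h>

    int
    main(void)
    {
        UErrorCode  status = U_ZERO_ERROR;
        UChar       displayname[128];
        char        utf8[256];
        int32_t     len;

        /* English display name for the locale behind de_phonebook%icu */
        len = uloc_getDisplayName("de@collation=phonebook", "en",
                                  displayname,
                                  sizeof(displayname) / sizeof(displayname[0]),
                                  &status);
        if (U_FAILURE(status))
            return 1;

        u_strToUTF8(utf8, sizeof(utf8), NULL, displayname, len, &status);
        if (U_FAILURE(status))
            return 1;

        puts(utf8);    /* something like "German (Phonebook Sort Order)" */
        return 0;
    }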
diff --git a/src/backend/common.mk b/src/backend/common.mk
index 5d599dbd0c..0b57543bc4 100644
--- a/src/backend/common.mk
+++ b/src/backend/common.mk
@@ -8,6 +8,8 @@
 # this directory and SUBDIRS to subdirectories containing more things
 # to build.
 
+override CPPFLAGS := $(CPPFLAGS) $(ICU_CFLAGS)
+
 ifdef PARTIAL_LINKING
 # old style: linking using SUBSYS.o
 subsysfilename = SUBSYS.o
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 05d8538717..2a8524a4ff 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -2922,6 +2922,16 @@ _copyAlterTableCmd(const AlterTableCmd *from)
 	return newnode;
 }
 
+static AlterCollationStmt *
+_copyAlterCollationStmt(const AlterCollationStmt *from)
+{
+	AlterCollationStmt *newnode = makeNode(AlterCollationStmt);
+
+	COPY_NODE_FIELD(collname);
+
+	return newnode;
+}
+
 static AlterDomainStmt *
 _copyAlterDomainStmt(const AlterDomainStmt *from)
 {
@@ -4854,6 +4864,9 @@ copyObject(const void *from)
 		case T_AlterTableCmd:
 			retval = _copyAlterTableCmd(from);
 			break;
+		case T_AlterCollationStmt:
+			retval = _copyAlterCollationStmt(from);
+			break;
 		case T_AlterDomainStmt:
 			retval = _copyAlterDomainStmt(from);
 			break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index d595cd7481..0ee88c7435 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1067,6 +1067,14 @@ _equalAlterTableCmd(const AlterTableCmd *a, const AlterTableCmd *b)
 }
 
 static bool
+_equalAlterCollationStmt(const AlterCollationStmt *a, const AlterCollationStmt *b)
+{
+	COMPARE_NODE_FIELD(collname);
+
+	return true;
+}
+
+static bool
 _equalAlterDomainStmt(const AlterDomainStmt *a, const AlterDomainStmt *b)
 {
 	COMPARE_SCALAR_FIELD(subtype);
@@ -3112,6 +3120,9 @@ equal(const void *a, const void *b)
 		case T_AlterTableCmd:
 			retval = _equalAlterTableCmd(a, b);
 			break;
+		case T_AlterCollationStmt:
+			retval = _equalAlterCollationStmt(a, b);
+			break;
 		case T_AlterDomainStmt:
 			retval = _equalAlterDomainStmt(a, b);
 			break;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 5cb82977d5..c8be4be120 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -244,7 +244,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 }
 
 %type <node>	stmt schema_stmt
-		AlterEventTrigStmt
+		AlterEventTrigStmt AlterCollationStmt
 		AlterDatabaseStmt AlterDatabaseSetStmt AlterDomainStmt AlterEnumStmt
 		AlterFdwStmt AlterForeignServerStmt AlterGroupStmt
 		AlterObjectDependsStmt AlterObjectSchemaStmt AlterOwnerStmt
@@ -806,6 +806,7 @@ stmtmulti:	stmtmulti ';' stmt
 
 stmt :
 			AlterEventTrigStmt
+			| AlterCollationStmt
 			| AlterDatabaseStmt
 			| AlterDatabaseSetStmt
 			| AlterDefaultPrivilegesStmt
@@ -9736,6 +9737,21 @@ DropdbStmt: DROP DATABASE database_name
 
 /*****************************************************************************
  *
+ *		ALTER COLLATION
+ *
+ *****************************************************************************/
+
+AlterCollationStmt: ALTER COLLATION any_name REFRESH VERSION_P
+				{
+					AlterCollationStmt *n = makeNode(AlterCollationStmt);
+					n->collname = $3;
+					$$ = (Node *)n;
+				}
+		;
+
+
+/*****************************************************************************
+ *
  *		ALTER SYSTEM
  *
  * This is used to change configuration parameters persistently.
diff --git a/src/backend/regex/regc_pg_locale.c b/src/backend/regex/regc_pg_locale.c
index afa3a7d613..55c8ab12d6 100644
--- a/src/backend/regex/regc_pg_locale.c
+++ b/src/backend/regex/regc_pg_locale.c
@@ -68,7 +68,8 @@ typedef enum
 	PG_REGEX_LOCALE_WIDE,		/* Use <wctype.h> functions */
 	PG_REGEX_LOCALE_1BYTE,		/* Use <ctype.h> functions */
 	PG_REGEX_LOCALE_WIDE_L,		/* Use locale_t <wctype.h> functions */
-	PG_REGEX_LOCALE_1BYTE_L		/* Use locale_t <ctype.h> functions */
+	PG_REGEX_LOCALE_1BYTE_L,	/* Use locale_t <ctype.h> functions */
+	PG_REGEX_LOCALE_ICU			/* Use ICU uchar.h functions */
 } PG_Locale_Strategy;
 
 static PG_Locale_Strategy pg_regex_strategy;
@@ -262,6 +263,11 @@ pg_set_regex_collation(Oid collation)
 					 errhint("Use the COLLATE clause to set the collation explicitly.")));
 		}
 
+#ifdef USE_ICU
+		if (pg_regex_locale && pg_regex_locale->provider == COLLPROVIDER_ICU)
+			pg_regex_strategy = PG_REGEX_LOCALE_ICU;
+		else
+#endif
 #ifdef USE_WIDE_UPPER_LOWER
 		if (GetDatabaseEncoding() == PG_UTF8)
 		{
@@ -303,13 +309,18 @@ pg_wc_isdigit(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswdigit_l((wint_t) c, pg_regex_locale);
+				return iswdigit_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isdigit_l((unsigned char) c, pg_regex_locale));
+					isdigit_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isdigit(c);
 #endif
 			break;
 	}
@@ -336,13 +347,18 @@ pg_wc_isalpha(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswalpha_l((wint_t) c, pg_regex_locale);
+				return iswalpha_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isalpha_l((unsigned char) c, pg_regex_locale));
+					isalpha_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isalpha(c);
 #endif
 			break;
 	}
@@ -369,13 +385,18 @@ pg_wc_isalnum(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswalnum_l((wint_t) c, pg_regex_locale);
+				return iswalnum_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isalnum_l((unsigned char) c, pg_regex_locale));
+					isalnum_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isalnum(c);
 #endif
 			break;
 	}
@@ -402,13 +423,18 @@ pg_wc_isupper(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswupper_l((wint_t) c, pg_regex_locale);
+				return iswupper_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isupper_l((unsigned char) c, pg_regex_locale));
+					isupper_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isupper(c);
 #endif
 			break;
 	}
@@ -435,13 +461,18 @@ pg_wc_islower(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswlower_l((wint_t) c, pg_regex_locale);
+				return iswlower_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					islower_l((unsigned char) c, pg_regex_locale));
+					islower_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_islower(c);
 #endif
 			break;
 	}
@@ -468,13 +499,18 @@ pg_wc_isgraph(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswgraph_l((wint_t) c, pg_regex_locale);
+				return iswgraph_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isgraph_l((unsigned char) c, pg_regex_locale));
+					isgraph_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isgraph(c);
 #endif
 			break;
 	}
@@ -501,13 +537,18 @@ pg_wc_isprint(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswprint_l((wint_t) c, pg_regex_locale);
+				return iswprint_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isprint_l((unsigned char) c, pg_regex_locale));
+					isprint_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isprint(c);
 #endif
 			break;
 	}
@@ -534,13 +575,18 @@ pg_wc_ispunct(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswpunct_l((wint_t) c, pg_regex_locale);
+				return iswpunct_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					ispunct_l((unsigned char) c, pg_regex_locale));
+					ispunct_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_ispunct(c);
 #endif
 			break;
 	}
@@ -567,13 +613,18 @@ pg_wc_isspace(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswspace_l((wint_t) c, pg_regex_locale);
+				return iswspace_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isspace_l((unsigned char) c, pg_regex_locale));
+					isspace_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isspace(c);
 #endif
 			break;
 	}
@@ -608,15 +659,20 @@ pg_wc_toupper(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return towupper_l((wint_t) c, pg_regex_locale);
+				return towupper_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			if (c <= (pg_wchar) UCHAR_MAX)
-				return toupper_l((unsigned char) c, pg_regex_locale);
+				return toupper_l((unsigned char) c, pg_regex_locale->info.lt);
 #endif
 			return c;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_toupper(c);
+#endif
+			break;
 	}
 	return 0;					/* can't get here, but keep compiler quiet */
 }
@@ -649,15 +705,20 @@ pg_wc_tolower(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return towlower_l((wint_t) c, pg_regex_locale);
+				return towlower_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			if (c <= (pg_wchar) UCHAR_MAX)
-				return tolower_l((unsigned char) c, pg_regex_locale);
+				return tolower_l((unsigned char) c, pg_regex_locale->info.lt);
 #endif
 			return c;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_tolower(c);
+#endif
+			break;
 	}
 	return 0;					/* can't get here, but keep compiler quiet */
 }
@@ -808,6 +869,9 @@ pg_ctype_get_cache(pg_wc_probefunc probefunc, int cclasscode)
 			max_chr = (pg_wchar) MAX_SIMPLE_CHR;
 #endif
 			break;
+		case PG_REGEX_LOCALE_ICU:
+			max_chr = (pg_wchar) MAX_SIMPLE_CHR;
+			break;
 		default:
 			max_chr = 0;		/* can't get here, but keep compiler quiet */
 			break;
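Illustration, not patch code: the new PG_REGEX_LOCALE_ICU branches above call ICU's per-code-point classification functions from <unicode/uchar.h>, which take no locale argument.  A minimal standalone example:

    #include <stdio.h>
    #include <unicode/uchar.h>

    int
    main(void)
    {
        UChar32     c = 0x00E9;     /* U+00E9 LATIN SMALL LETTER E WITH ACUTE */

        /* these operate on single code points, with no locale parameter */
        printf("isalpha: %d  islower: %d  toupper: U+%04X\n",
               (int) u_isalpha(c), (int) u_islower(c), (unsigned) u_toupper(c));
        return 0;
    }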
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 3bc0ae5e7e..2a70985906 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -1622,6 +1622,10 @@ ProcessUtilitySlow(ParseState *pstate,
 				commandCollected = true;
 				break;
 
+			case T_AlterCollationStmt:
+				address = AlterCollation((AlterCollationStmt *) parsetree);
+				break;
+
 			default:
 				elog(ERROR, "unrecognized node type: %d",
 					 (int) nodeTag(parsetree));
@@ -2672,6 +2676,10 @@ CreateCommandTag(Node *parsetree)
 			tag = "DROP SUBSCRIPTION";
 			break;
 
+		case T_AlterCollationStmt:
+			tag = "ALTER COLLATION";
+			break;
+
 		case T_PrepareStmt:
 			tag = "PREPARE";
 			break;
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index 4f3d8a1189..306abdcffc 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -82,6 +82,10 @@
 #include <wctype.h>
 #endif
 
+#ifdef USE_ICU
+#include <unicode/ustring.h>
+#endif
+
 #include "catalog/pg_collation.h"
 #include "mb/pg_wchar.h"
 #include "utils/builtins.h"
@@ -1443,6 +1447,42 @@ str_numth(char *dest, char *num, int type)
  *			upper/lower/initcap functions
  *****************************************************************************/
 
+#ifdef USE_ICU
+static int32_t
+icu_convert_case(int32_t (*func)(UChar *, int32_t, const UChar *, int32_t, const char *, UErrorCode *),
+				 pg_locale_t mylocale, UChar **buff_dest, UChar *buff_source, int32_t len_source)
+{
+	UErrorCode	status;
+	int32_t		len_dest;
+
+	len_dest = len_source;  /* try first with same length */
+	*buff_dest = palloc(len_dest * sizeof(**buff_dest));
+	status = U_ZERO_ERROR;
+	len_dest = func(*buff_dest, len_dest, buff_source, len_source, mylocale->info.icu.locale, &status);
+	if (status == U_BUFFER_OVERFLOW_ERROR)
+	{
+		/* try again with adjusted length */
+		pfree(*buff_dest);
+		*buff_dest = palloc(len_dest * sizeof(**buff_dest));
+		status = U_ZERO_ERROR;
+		len_dest = func(*buff_dest, len_dest, buff_source, len_source, mylocale->info.icu.locale, &status);
+	}
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("case conversion failed: %s", u_errorName(status))));
+	return len_dest;
+}
+
+static int32_t
+u_strToTitle_default_BI(UChar *dest, int32_t destCapacity,
+						const UChar *src, int32_t srcLength,
+						const char *locale,
+						UErrorCode *pErrorCode)
+{
+	return u_strToTitle(dest, destCapacity, src, srcLength, NULL, locale, pErrorCode);
+}
+#endif
+
 /*
  * If the system provides the needed functions for wide-character manipulation
  * (which are all standardized by C99), then we implement upper/lower/initcap
@@ -1479,12 +1519,9 @@ str_tolower(const char *buff, size_t nbytes, Oid collid)
 		result = asc_tolower(buff, nbytes);
 	}
 #ifdef USE_WIDE_UPPER_LOWER
-	else if (pg_database_encoding_max_length() > 1)
+	else
 	{
 		pg_locale_t mylocale = 0;
-		wchar_t    *workspace;
-		size_t		curr_char;
-		size_t		result_size;
 
 		if (collid != DEFAULT_COLLATION_OID)
 		{
@@ -1502,77 +1539,79 @@ str_tolower(const char *buff, size_t nbytes, Oid collid)
 			mylocale = pg_newlocale_from_collation(collid);
 		}
 
-		/* Overflow paranoia */
-		if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
-			ereport(ERROR,
-					(errcode(ERRCODE_OUT_OF_MEMORY),
-					 errmsg("out of memory")));
+#ifdef USE_ICU
+		if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
+		{
+			int32_t		len_uchar;
+			int32_t		len_conv;
+			UChar	   *buff_uchar;
+			UChar	   *buff_conv;
+
+			len_uchar = icu_to_uchar(&buff_uchar, buff, nbytes);
+			len_conv = icu_convert_case(u_strToLower, mylocale, &buff_conv, buff_uchar, len_uchar);
+			icu_from_uchar(&result, buff_conv, len_conv);
+		}
+		else
+#endif
+		{
+			if (pg_database_encoding_max_length() > 1)
+			{
+				wchar_t    *workspace;
+				size_t		curr_char;
+				size_t		result_size;
 
-		/* Output workspace cannot have more codes than input bytes */
-		workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
+				/* Overflow paranoia */
+				if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
+					ereport(ERROR,
+							(errcode(ERRCODE_OUT_OF_MEMORY),
+							 errmsg("out of memory")));
 
-		char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
+				/* Output workspace cannot have more codes than input bytes */
+				workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
 
-		for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
-		{
+				char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
+
+				for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
+				{
 #ifdef HAVE_LOCALE_T
-			if (mylocale)
-				workspace[curr_char] = towlower_l(workspace[curr_char], mylocale);
-			else
+					if (mylocale)
+						workspace[curr_char] = towlower_l(workspace[curr_char], mylocale->info.lt);
+					else
 #endif
-				workspace[curr_char] = towlower(workspace[curr_char]);
-		}
+						workspace[curr_char] = towlower(workspace[curr_char]);
+				}
 
-		/* Make result large enough; case change might change number of bytes */
-		result_size = curr_char * pg_database_encoding_max_length() + 1;
-		result = palloc(result_size);
+				/* Make result large enough; case change might change number of bytes */
+				result_size = curr_char * pg_database_encoding_max_length() + 1;
+				result = palloc(result_size);
 
-		wchar2char(result, workspace, result_size, mylocale);
-		pfree(workspace);
-	}
+				wchar2char(result, workspace, result_size, mylocale);
+				pfree(workspace);
+			}
 #endif   /* USE_WIDE_UPPER_LOWER */
-	else
-	{
-#ifdef HAVE_LOCALE_T
-		pg_locale_t mylocale = 0;
-#endif
-		char	   *p;
-
-		if (collid != DEFAULT_COLLATION_OID)
-		{
-			if (!OidIsValid(collid))
+			else
 			{
-				/*
-				 * This typically means that the parser could not resolve a
-				 * conflict of implicit collations, so report it that way.
-				 */
-				ereport(ERROR,
-						(errcode(ERRCODE_INDETERMINATE_COLLATION),
-						 errmsg("could not determine which collation to use for lower() function"),
-						 errhint("Use the COLLATE clause to set the collation explicitly.")));
-			}
-#ifdef HAVE_LOCALE_T
-			mylocale = pg_newlocale_from_collation(collid);
-#endif
-		}
+				char	   *p;
 
-		result = pnstrdup(buff, nbytes);
+				result = pnstrdup(buff, nbytes);
 
-		/*
-		 * Note: we assume that tolower_l() will not be so broken as to need
-		 * an isupper_l() guard test.  When using the default collation, we
-		 * apply the traditional Postgres behavior that forces ASCII-style
-		 * treatment of I/i, but in non-default collations you get exactly
-		 * what the collation says.
-		 */
-		for (p = result; *p; p++)
-		{
+				/*
+				 * Note: we assume that tolower_l() will not be so broken as to need
+				 * an isupper_l() guard test.  When using the default collation, we
+				 * apply the traditional Postgres behavior that forces ASCII-style
+				 * treatment of I/i, but in non-default collations you get exactly
+				 * what the collation says.
+				 */
+				for (p = result; *p; p++)
+				{
 #ifdef HAVE_LOCALE_T
-			if (mylocale)
-				*p = tolower_l((unsigned char) *p, mylocale);
-			else
+					if (mylocale)
+						*p = tolower_l((unsigned char) *p, mylocale->info.lt);
+					else
 #endif
-				*p = pg_tolower((unsigned char) *p);
+						*p = pg_tolower((unsigned char) *p);
+				}
+			}
 		}
 	}
 
@@ -1599,12 +1638,9 @@ str_toupper(const char *buff, size_t nbytes, Oid collid)
 		result = asc_toupper(buff, nbytes);
 	}
 #ifdef USE_WIDE_UPPER_LOWER
-	else if (pg_database_encoding_max_length() > 1)
+	else
 	{
 		pg_locale_t mylocale = 0;
-		wchar_t    *workspace;
-		size_t		curr_char;
-		size_t		result_size;
 
 		if (collid != DEFAULT_COLLATION_OID)
 		{
@@ -1622,77 +1658,78 @@ str_toupper(const char *buff, size_t nbytes, Oid collid)
 			mylocale = pg_newlocale_from_collation(collid);
 		}
 
-		/* Overflow paranoia */
-		if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
-			ereport(ERROR,
-					(errcode(ERRCODE_OUT_OF_MEMORY),
-					 errmsg("out of memory")));
+#ifdef USE_ICU
+		if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
+		{
+			int32_t		len_uchar, len_conv;
+			UChar	   *buff_uchar;
+			UChar	   *buff_conv;
 
-		/* Output workspace cannot have more codes than input bytes */
-		workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
+			len_uchar = icu_to_uchar(&buff_uchar, buff, nbytes);
+			len_conv = icu_convert_case(u_strToUpper, mylocale, &buff_conv, buff_uchar, len_uchar);
+			icu_from_uchar(&result, buff_conv, len_conv);
+		}
+		else
+#endif
+		{
+			if (pg_database_encoding_max_length() > 1)
+			{
+				wchar_t    *workspace;
+				size_t		curr_char;
+				size_t		result_size;
 
-		char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
+				/* Overflow paranoia */
+				if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
+					ereport(ERROR,
+							(errcode(ERRCODE_OUT_OF_MEMORY),
+							 errmsg("out of memory")));
 
-		for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
-		{
-#ifdef HAVE_LOCALE_T
-			if (mylocale)
-				workspace[curr_char] = towupper_l(workspace[curr_char], mylocale);
-			else
-#endif
-				workspace[curr_char] = towupper(workspace[curr_char]);
-		}
+				/* Output workspace cannot have more codes than input bytes */
+				workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
 
-		/* Make result large enough; case change might change number of bytes */
-		result_size = curr_char * pg_database_encoding_max_length() + 1;
-		result = palloc(result_size);
+				char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
 
-		wchar2char(result, workspace, result_size, mylocale);
-		pfree(workspace);
-	}
-#endif   /* USE_WIDE_UPPER_LOWER */
-	else
-	{
+				for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
+				{
 #ifdef HAVE_LOCALE_T
-		pg_locale_t mylocale = 0;
+					if (mylocale)
+						workspace[curr_char] = towupper_l(workspace[curr_char], mylocale->info.lt);
+					else
 #endif
-		char	   *p;
+						workspace[curr_char] = towupper(workspace[curr_char]);
+				}
 
-		if (collid != DEFAULT_COLLATION_OID)
-		{
-			if (!OidIsValid(collid))
-			{
-				/*
-				 * This typically means that the parser could not resolve a
-				 * conflict of implicit collations, so report it that way.
-				 */
-				ereport(ERROR,
-						(errcode(ERRCODE_INDETERMINATE_COLLATION),
-						 errmsg("could not determine which collation to use for upper() function"),
-						 errhint("Use the COLLATE clause to set the collation explicitly.")));
+				/* Make result large enough; case change might change number of bytes */
+				result_size = curr_char * pg_database_encoding_max_length() + 1;
+				result = palloc(result_size);
+
+				wchar2char(result, workspace, result_size, mylocale);
+				pfree(workspace);
 			}
-#ifdef HAVE_LOCALE_T
-			mylocale = pg_newlocale_from_collation(collid);
-#endif
-		}
+#endif   /* USE_WIDE_UPPER_LOWER */
+			else
+			{
+				char	   *p;
 
-		result = pnstrdup(buff, nbytes);
+				result = pnstrdup(buff, nbytes);
 
-		/*
-		 * Note: we assume that toupper_l() will not be so broken as to need
-		 * an islower_l() guard test.  When using the default collation, we
-		 * apply the traditional Postgres behavior that forces ASCII-style
-		 * treatment of I/i, but in non-default collations you get exactly
-		 * what the collation says.
-		 */
-		for (p = result; *p; p++)
-		{
+				/*
+				 * Note: we assume that toupper_l() will not be so broken as to need
+				 * an islower_l() guard test.  When using the default collation, we
+				 * apply the traditional Postgres behavior that forces ASCII-style
+				 * treatment of I/i, but in non-default collations you get exactly
+				 * what the collation says.
+				 */
+				for (p = result; *p; p++)
+				{
 #ifdef HAVE_LOCALE_T
-			if (mylocale)
-				*p = toupper_l((unsigned char) *p, mylocale);
-			else
+					if (mylocale)
+						*p = toupper_l((unsigned char) *p, mylocale->info.lt);
+					else
 #endif
-				*p = pg_toupper((unsigned char) *p);
+						*p = pg_toupper((unsigned char) *p);
+				}
+			}
 		}
 	}
 
@@ -1720,12 +1757,9 @@ str_initcap(const char *buff, size_t nbytes, Oid collid)
 		result = asc_initcap(buff, nbytes);
 	}
 #ifdef USE_WIDE_UPPER_LOWER
-	else if (pg_database_encoding_max_length() > 1)
+	else
 	{
 		pg_locale_t mylocale = 0;
-		wchar_t    *workspace;
-		size_t		curr_char;
-		size_t		result_size;
 
 		if (collid != DEFAULT_COLLATION_OID)
 		{
@@ -1743,100 +1777,101 @@ str_initcap(const char *buff, size_t nbytes, Oid collid)
 			mylocale = pg_newlocale_from_collation(collid);
 		}
 
-		/* Overflow paranoia */
-		if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
-			ereport(ERROR,
-					(errcode(ERRCODE_OUT_OF_MEMORY),
-					 errmsg("out of memory")));
-
-		/* Output workspace cannot have more codes than input bytes */
-		workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
-
-		char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
-
-		for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
+#ifdef USE_ICU
+		if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
 		{
-#ifdef HAVE_LOCALE_T
-			if (mylocale)
-			{
-				if (wasalnum)
-					workspace[curr_char] = towlower_l(workspace[curr_char], mylocale);
-				else
-					workspace[curr_char] = towupper_l(workspace[curr_char], mylocale);
-				wasalnum = iswalnum_l(workspace[curr_char], mylocale);
-			}
-			else
+			int32_t		len_uchar, len_conv;
+			UChar	   *buff_uchar;
+			UChar	   *buff_conv;
+
+			len_uchar = icu_to_uchar(&buff_uchar, buff, nbytes);
+			len_conv = icu_convert_case(u_strToTitle_default_BI, mylocale, &buff_conv, buff_uchar, len_uchar);
+			icu_from_uchar(&result, buff_conv, len_conv);
+		}
+		else
 #endif
+		{
+			if (pg_database_encoding_max_length() > 1)
 			{
-				if (wasalnum)
-					workspace[curr_char] = towlower(workspace[curr_char]);
-				else
-					workspace[curr_char] = towupper(workspace[curr_char]);
-				wasalnum = iswalnum(workspace[curr_char]);
-			}
-		}
+				wchar_t    *workspace;
+				size_t		curr_char;
+				size_t		result_size;
 
-		/* Make result large enough; case change might change number of bytes */
-		result_size = curr_char * pg_database_encoding_max_length() + 1;
-		result = palloc(result_size);
+				/* Overflow paranoia */
+				if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
+					ereport(ERROR,
+							(errcode(ERRCODE_OUT_OF_MEMORY),
+							 errmsg("out of memory")));
 
-		wchar2char(result, workspace, result_size, mylocale);
-		pfree(workspace);
-	}
-#endif   /* USE_WIDE_UPPER_LOWER */
-	else
-	{
-#ifdef HAVE_LOCALE_T
-		pg_locale_t mylocale = 0;
-#endif
-		char	   *p;
+				/* Output workspace cannot have more codes than input bytes */
+				workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
 
-		if (collid != DEFAULT_COLLATION_OID)
-		{
-			if (!OidIsValid(collid))
-			{
-				/*
-				 * This typically means that the parser could not resolve a
-				 * conflict of implicit collations, so report it that way.
-				 */
-				ereport(ERROR,
-						(errcode(ERRCODE_INDETERMINATE_COLLATION),
-						 errmsg("could not determine which collation to use for initcap() function"),
-						 errhint("Use the COLLATE clause to set the collation explicitly.")));
-			}
+				char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
+
+				for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
+				{
 #ifdef HAVE_LOCALE_T
-			mylocale = pg_newlocale_from_collation(collid);
+					if (mylocale)
+					{
+						if (wasalnum)
+							workspace[curr_char] = towlower_l(workspace[curr_char], mylocale->info.lt);
+						else
+							workspace[curr_char] = towupper_l(workspace[curr_char], mylocale->info.lt);
+						wasalnum = iswalnum_l(workspace[curr_char], mylocale->info.lt);
+					}
+					else
 #endif
-		}
+					{
+						if (wasalnum)
+							workspace[curr_char] = towlower(workspace[curr_char]);
+						else
+							workspace[curr_char] = towupper(workspace[curr_char]);
+						wasalnum = iswalnum(workspace[curr_char]);
+					}
+				}
 
-		result = pnstrdup(buff, nbytes);
+				/* Make result large enough; case change might change number of bytes */
+				result_size = curr_char * pg_database_encoding_max_length() + 1;
+				result = palloc(result_size);
 
-		/*
-		 * Note: we assume that toupper_l()/tolower_l() will not be so broken
-		 * as to need guard tests.  When using the default collation, we apply
-		 * the traditional Postgres behavior that forces ASCII-style treatment
-		 * of I/i, but in non-default collations you get exactly what the
-		 * collation says.
-		 */
-		for (p = result; *p; p++)
-		{
-#ifdef HAVE_LOCALE_T
-			if (mylocale)
-			{
-				if (wasalnum)
-					*p = tolower_l((unsigned char) *p, mylocale);
-				else
-					*p = toupper_l((unsigned char) *p, mylocale);
-				wasalnum = isalnum_l((unsigned char) *p, mylocale);
+				wchar2char(result, workspace, result_size, mylocale);
+				pfree(workspace);
 			}
+#endif   /* USE_WIDE_UPPER_LOWER */
 			else
-#endif
 			{
-				if (wasalnum)
-					*p = pg_tolower((unsigned char) *p);
-				else
-					*p = pg_toupper((unsigned char) *p);
-				wasalnum = isalnum((unsigned char) *p);
+				char	   *p;
+
+				result = pnstrdup(buff, nbytes);
+
+				/*
+				 * Note: we assume that toupper_l()/tolower_l() will not be so broken
+				 * as to need guard tests.  When using the default collation, we apply
+				 * the traditional Postgres behavior that forces ASCII-style treatment
+				 * of I/i, but in non-default collations you get exactly what the
+				 * collation says.
+				 */
+				for (p = result; *p; p++)
+				{
+#ifdef HAVE_LOCALE_T
+					if (mylocale)
+					{
+						if (wasalnum)
+							*p = tolower_l((unsigned char) *p, mylocale->info.lt);
+						else
+							*p = toupper_l((unsigned char) *p, mylocale->info.lt);
+						wasalnum = isalnum_l((unsigned char) *p, mylocale->info.lt);
+					}
+					else
+#endif
+					{
+						if (wasalnum)
+							*p = pg_tolower((unsigned char) *p);
+						else
+							*p = pg_toupper((unsigned char) *p);
+						wasalnum = isalnum((unsigned char) *p);
+					}
+				}
 			}
 		}
 	}
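
The str_initcap() hunk above passes u_strToTitle_default_BI to icu_convert_case(); that helper isn't visible in the quoted hunks. Since ICU's u_strToTitle() takes an extra UBreakIterator argument, it is presumably just a thin adapter so that all three case-mapping functions share the signature icu_convert_case() expects. A minimal sketch of such a wrapper (hypothetical, for illustration only; passing NULL selects the default word break iterator):

static int32_t
u_strToTitle_default_BI(UChar *dest, int32_t destCapacity,
						const UChar *src, int32_t srcLength,
						const char *locale, UErrorCode *pErrorCode)
{
	/* NULL break iterator means: use the default word boundaries */
	return u_strToTitle(dest, destCapacity, src, srcLength,
						NULL, locale, pErrorCode);
}
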
diff --git a/src/backend/utils/adt/like.c b/src/backend/utils/adt/like.c
index 91fe109867..8a4f768de3 100644
--- a/src/backend/utils/adt/like.c
+++ b/src/backend/utils/adt/like.c
@@ -96,7 +96,7 @@ SB_lower_char(unsigned char c, pg_locale_t locale, bool locale_is_c)
 		return pg_ascii_tolower(c);
 #ifdef HAVE_LOCALE_T
 	else if (locale)
-		return tolower_l(c, locale);
+		return tolower_l(c, locale->info.lt);
 #endif
 	else
 		return pg_tolower(c);
@@ -165,14 +165,36 @@ Generic_Text_IC_like(text *str, text *pat, Oid collation)
 			   *p;
 	int			slen,
 				plen;
+	pg_locale_t locale = 0;
+	bool		locale_is_c = false;
+
+	if (lc_ctype_is_c(collation))
+		locale_is_c = true;
+	else if (collation != DEFAULT_COLLATION_OID)
+	{
+		if (!OidIsValid(collation))
+		{
+			/*
+			 * This typically means that the parser could not resolve a
+			 * conflict of implicit collations, so report it that way.
+			 */
+			ereport(ERROR,
+					(errcode(ERRCODE_INDETERMINATE_COLLATION),
+					 errmsg("could not determine which collation to use for ILIKE"),
+					 errhint("Use the COLLATE clause to set the collation explicitly.")));
+		}
+		locale = pg_newlocale_from_collation(collation);
+	}
 
 	/*
 	 * For efficiency reasons, in the single byte case we don't call lower()
 	 * on the pattern and text, but instead call SB_lower_char on each
-	 * character.  In the multi-byte case we don't have much choice :-(
+	 * character.  In the multi-byte case we don't have much choice :-(.
+	 * Also, ICU does not support single-character case folding, so for ICU
+	 * collations we take the whole-string path as well.
 	 */
 
-	if (pg_database_encoding_max_length() > 1)
+	if (pg_database_encoding_max_length() > 1 ||
+		(locale && locale->provider == COLLPROVIDER_ICU))
 	{
 		/* lower's result is never packed, so OK to use old macros here */
 		pat = DatumGetTextP(DirectFunctionCall1Coll(lower, collation,
@@ -190,31 +212,6 @@ Generic_Text_IC_like(text *str, text *pat, Oid collation)
 	}
 	else
 	{
-		/*
-		 * Here we need to prepare locale information for SB_lower_char. This
-		 * should match the methods used in str_tolower().
-		 */
-		pg_locale_t locale = 0;
-		bool		locale_is_c = false;
-
-		if (lc_ctype_is_c(collation))
-			locale_is_c = true;
-		else if (collation != DEFAULT_COLLATION_OID)
-		{
-			if (!OidIsValid(collation))
-			{
-				/*
-				 * This typically means that the parser could not resolve a
-				 * conflict of implicit collations, so report it that way.
-				 */
-				ereport(ERROR,
-						(errcode(ERRCODE_INDETERMINATE_COLLATION),
-						 errmsg("could not determine which collation to use for ILIKE"),
-						 errhint("Use the COLLATE clause to set the collation explicitly.")));
-			}
-			locale = pg_newlocale_from_collation(collation);
-		}
-
 		p = VARDATA_ANY(pat);
 		plen = VARSIZE_ANY_EXHDR(pat);
 		s = VARDATA_ANY(str);
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index 4c7f1dad50..3e2c4405bf 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -58,11 +58,17 @@
 #include "catalog/pg_collation.h"
 #include "catalog/pg_control.h"
 #include "mb/pg_wchar.h"
+#include "utils/builtins.h"
 #include "utils/hsearch.h"
+#include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_locale.h"
 #include "utils/syscache.h"
 
+#ifdef USE_ICU
+#include <unicode/ucnv.h>
+#endif
+
 #ifdef WIN32
 /*
  * This Windows file defines StrNCpy. We don't need it here, so we undefine
@@ -1273,12 +1279,13 @@ pg_newlocale_from_collation(Oid collid)
 	if (cache_entry->locale == 0)
 	{
 		/* We haven't computed this yet in this session, so do it */
-#ifdef HAVE_LOCALE_T
 		HeapTuple	tp;
 		Form_pg_collation collform;
 		const char *collcollate;
-		const char *collctype;
-		locale_t	result;
+		const char *collctype pg_attribute_unused();
+		pg_locale_t	result;
+		Datum		collversion;
+		bool		isnull;
 
 		tp = SearchSysCache1(COLLOID, ObjectIdGetDatum(collid));
 		if (!HeapTupleIsValid(tp))
@@ -1288,61 +1295,223 @@ pg_newlocale_from_collation(Oid collid)
 		collcollate = NameStr(collform->collcollate);
 		collctype = NameStr(collform->collctype);
 
-		if (strcmp(collcollate, collctype) == 0)
+		result = malloc(sizeof(*result));
+		if (!result)
+			ereport(ERROR,
+					(errcode(ERRCODE_OUT_OF_MEMORY),
+					 errmsg("out of memory")));
+		memset(result, 0, sizeof(*result));
+		result->provider = collform->collprovider;
+
+		if (collform->collprovider == COLLPROVIDER_LIBC)
 		{
-			/* Normal case where they're the same */
+#ifdef HAVE_LOCALE_T
+			locale_t	loc;
+
+			if (strcmp(collcollate, collctype) == 0)
+			{
+				/* Normal case where they're the same */
 #ifndef WIN32
-			result = newlocale(LC_COLLATE_MASK | LC_CTYPE_MASK, collcollate,
-							   NULL);
+				loc = newlocale(LC_COLLATE_MASK | LC_CTYPE_MASK, collcollate,
+								   NULL);
 #else
-			result = _create_locale(LC_ALL, collcollate);
+				loc = _create_locale(LC_ALL, collcollate);
 #endif
-			if (!result)
-				report_newlocale_failure(collcollate);
-		}
-		else
-		{
+				if (!loc)
+					report_newlocale_failure(collcollate);
+			}
+			else
+			{
 #ifndef WIN32
-			/* We need two newlocale() steps */
-			locale_t	loc1;
-
-			loc1 = newlocale(LC_COLLATE_MASK, collcollate, NULL);
-			if (!loc1)
-				report_newlocale_failure(collcollate);
-			result = newlocale(LC_CTYPE_MASK, collctype, loc1);
-			if (!result)
-				report_newlocale_failure(collctype);
+				/* We need two newlocale() steps */
+				locale_t	loc1;
+
+				loc1 = newlocale(LC_COLLATE_MASK, collcollate, NULL);
+				if (!loc1)
+					report_newlocale_failure(collcollate);
+				loc = newlocale(LC_CTYPE_MASK, collctype, loc1);
+				if (!loc)
+					report_newlocale_failure(collctype);
 #else
 
-			/*
-			 * XXX The _create_locale() API doesn't appear to support this.
-			 * Could perhaps be worked around by changing pg_locale_t to
-			 * contain two separate fields.
-			 */
+				/*
+				 * XXX The _create_locale() API doesn't appear to support this.
+				 * Could perhaps be worked around by changing pg_locale_t to
+				 * contain two separate fields.
+				 */
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("collations with different collate and ctype values are not supported on this platform")));
+#endif
+			}
+
+			result->info.lt = loc;
+#else							/* not HAVE_LOCALE_T */
+			/* platform that doesn't support locale_t */
 			ereport(ERROR,
 					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-					 errmsg("collations with different collate and ctype values are not supported on this platform")));
-#endif
+					 errmsg("collation provider LIBC is not supported on this platform")));
+#endif   /* not HAVE_LOCALE_T */
+		}
+		else if (collform->collprovider == COLLPROVIDER_ICU)
+		{
+#ifdef USE_ICU
+			UCollator  *collator;
+			UErrorCode	status;
+
+			status = U_ZERO_ERROR;
+			collator = ucol_open(collcollate, &status);
+			if (U_FAILURE(status))
+				ereport(ERROR,
+						(errmsg("could not open collator for locale \"%s\": %s",
+								collcollate, u_errorName(status))));
+
+			result->info.icu.locale = strdup(collcollate);
+			result->info.icu.ucol = collator;
+#else /* not USE_ICU */
+			/* could get here if a collation was created by a build with ICU */
+			ereport(ERROR,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+					 errmsg("ICU is not supported in this build"), \
+					 errhint("You need to rebuild PostgreSQL using --with-icu.")));
+#endif /* not USE_ICU */
 		}
 
-		cache_entry->locale = result;
+		collversion = SysCacheGetAttr(COLLOID, tp, Anum_pg_collation_collversion,
+									  &isnull);
+		if (!isnull)
+		{
+			char	   *sysversionstr;
+			char	   *collversionstr;
+
+			sysversionstr = get_system_collation_version(collform->collprovider, collcollate);
+			collversionstr = TextDatumGetCString(collversion);
+
+			if (strcmp(sysversionstr, collversionstr) != 0)
+				ereport(WARNING,
+						(errmsg("collation \"%s\" has version mismatch",
+								NameStr(collform->collname)),
+						 errdetail("The collation in the database was created using version %s, "
+								   "but the operating system provides version %s.",
+								   collversionstr, sysversionstr),
+						 errhint("Rebuild all objects affected by this collation and run "
+								 "ALTER COLLATION %s REFRESH VERSION, "
+								 "or build PostgreSQL with the right library version.",
+								 quote_qualified_identifier(get_namespace_name(collform->collnamespace),
+															NameStr(collform->collname)))));
+		}
 
 		ReleaseSysCache(tp);
-#else							/* not HAVE_LOCALE_T */
 
-		/*
-		 * For platforms that don't support locale_t, we can't do anything
-		 * with non-default collations.
-		 */
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-		errmsg("nondefault collations are not supported on this platform")));
-#endif   /* not HAVE_LOCALE_T */
+		cache_entry->locale = result;
 	}
 
 	return cache_entry->locale;
 }
 
+/*
+ * Get provider-specific collation version string for the given collation from
+ * the operating system/library.
+ *
+ * A particular provider must always either return a non-NULL string or return
+ * NULL (if it doesn't support versions).  It must not return NULL for some
+ * collcollate values and non-NULL for others.
+ */
+char *
+get_system_collation_version(char collprovider, const char *collcollate)
+{
+	char	   *collversion;
+
+#ifdef USE_ICU
+	if (collprovider == COLLPROVIDER_ICU)
+	{
+		UCollator  *collator;
+		UErrorCode	status;
+		UVersionInfo versioninfo;
+		char		buf[U_MAX_VERSION_STRING_LENGTH];
+
+		status = U_ZERO_ERROR;
+		collator = ucol_open(collcollate, &status);
+		if (U_FAILURE(status))
+			ereport(ERROR,
+					(errmsg("could not open collator for locale \"%s\": %s",
+							collcollate, u_errorName(status))));
+		ucol_getVersion(collator, versioninfo);
+		ucol_close(collator);
+
+		u_versionToString(versioninfo, buf);
+		collversion = pstrdup(buf);
+	}
+	else
+#endif
+		collversion = NULL;
+
+	return collversion;
+}
+
+
+#ifdef USE_ICU
+/*
+ * Converter object for converting between ICU's UChar strings and C strings
+ * in database encoding.  Since the database encoding doesn't change, we only
+ * need one of these per session.
+ */
+static UConverter *icu_converter = NULL;
+
+static void
+init_icu_converter(void)
+{
+	const char *icu_encoding_name;
+	UErrorCode	status;
+	UConverter *conv;
+
+	if (icu_converter)
+		return;
+
+	icu_encoding_name = get_encoding_name_for_icu(GetDatabaseEncoding());
+
+	status = U_ZERO_ERROR;
+	conv = ucnv_open(icu_encoding_name, &status);
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("could not open ICU converter for encoding \"%s\": %s",
+						icu_encoding_name, u_errorName(status))));
+
+	icu_converter = conv;
+}
+
+int32_t
+icu_to_uchar(UChar **buff_uchar, const char *buff, size_t nbytes)
+{
+	UErrorCode	status;
+	int32_t		len_uchar;
+
+	init_icu_converter();
+
+	len_uchar = 2 * nbytes;  /* max length per docs */
+	*buff_uchar = palloc(len_uchar * sizeof(**buff_uchar));
+	status = U_ZERO_ERROR;
+	len_uchar = ucnv_toUChars(icu_converter, *buff_uchar, len_uchar, buff, nbytes, &status);
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("ucnv_toUChars failed: %s", u_errorName(status))));
+	return len_uchar;
+}
+
+int32_t
+icu_from_uchar(char **result, UChar *buff_uchar, int32_t len_uchar)
+{
+	UErrorCode	status;
+	int32_t		len_result;
+
+	init_icu_converter();
+
+	len_result = UCNV_GET_MAX_BYTES_FOR_STRING(len_uchar, ucnv_getMaxCharSize(icu_converter));
+	*result = palloc(len_result + 1);
+	status = U_ZERO_ERROR;
+	ucnv_fromUChars(icu_converter, *result, len_result, buff_uchar, len_uchar, &status);
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("ucnv_fromUChars failed: %s", u_errorName(status))));
+	return len_result;
+}
+#endif
 
 /*
  * These functions convert from/to libc's wchar_t, *not* pg_wchar_t.
@@ -1363,6 +1532,8 @@ wchar2char(char *to, const wchar_t *from, size_t tolen, pg_locale_t locale)
 {
 	size_t		result;
 
+	Assert(!locale || locale->provider == COLLPROVIDER_LIBC);
+
 	if (tolen == 0)
 		return 0;
 
@@ -1399,10 +1570,10 @@ wchar2char(char *to, const wchar_t *from, size_t tolen, pg_locale_t locale)
 #ifdef HAVE_LOCALE_T
 #ifdef HAVE_WCSTOMBS_L
 		/* Use wcstombs_l for nondefault locales */
-		result = wcstombs_l(to, from, tolen, locale);
+		result = wcstombs_l(to, from, tolen, locale->info.lt);
 #else							/* !HAVE_WCSTOMBS_L */
 		/* We have to temporarily set the locale as current ... ugh */
-		locale_t	save_locale = uselocale(locale);
+		locale_t	save_locale = uselocale(locale->info.lt);
 
 		result = wcstombs(to, from, tolen);
 
@@ -1433,6 +1604,8 @@ char2wchar(wchar_t *to, size_t tolen, const char *from, size_t fromlen,
 {
 	size_t		result;
 
+	Assert(!locale || locale->provider == COLLPROVIDER_LIBC);
+
 	if (tolen == 0)
 		return 0;
 
@@ -1474,10 +1647,10 @@ char2wchar(wchar_t *to, size_t tolen, const char *from, size_t fromlen,
 #ifdef HAVE_LOCALE_T
 #ifdef HAVE_MBSTOWCS_L
 			/* Use mbstowcs_l for nondefault locales */
-			result = mbstowcs_l(to, str, tolen, locale);
+			result = mbstowcs_l(to, str, tolen, locale->info.lt);
 #else							/* !HAVE_MBSTOWCS_L */
 			/* We have to temporarily set the locale as current ... ugh */
-			locale_t	save_locale = uselocale(locale);
+			locale_t	save_locale = uselocale(locale->info.lt);
 
 			result = mbstowcs(to, str, tolen);
 
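For readers following the formatting.c hunks: icu_convert_case() itself is also not part of this excerpt. A rough sketch of how such a helper could look, assuming the usual ICU preflighting convention (on U_BUFFER_OVERFLOW_ERROR the function reports the required length, so one retry with a properly sized buffer suffices) and the pg_locale_t struct introduced by this patch; the helper actually in the patch may differ:

typedef int32_t (*ICU_Convert_Func) (UChar *dest, int32_t destCapacity,
									 const UChar *src, int32_t srcLength,
									 const char *locale,
									 UErrorCode *pErrorCode);

static int32_t
icu_convert_case(ICU_Convert_Func func, pg_locale_t mylocale,
				 UChar **buff_dest, UChar *buff_source, int32_t len_source)
{
	UErrorCode	status;
	int32_t		len_dest;

	len_dest = len_source;		/* first guess: same length as the input */
	*buff_dest = palloc(len_dest * sizeof(**buff_dest) + 1);
	status = U_ZERO_ERROR;
	len_dest = func(*buff_dest, len_dest, buff_source, len_source,
					mylocale->info.icu.locale, &status);
	if (status == U_BUFFER_OVERFLOW_ERROR)
	{
		/* retry once with the length ICU told us it needs */
		pfree(*buff_dest);
		*buff_dest = palloc(len_dest * sizeof(**buff_dest) + 1);
		status = U_ZERO_ERROR;
		len_dest = func(*buff_dest, len_dest, buff_source, len_source,
						mylocale->info.icu.locale, &status);
	}
	if (U_FAILURE(status))
		ereport(ERROR,
				(errmsg("case conversion failed: %s", u_errorName(status))));
	return len_dest;
}
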
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index fa32e9eabe..040f8578de 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -5275,7 +5275,7 @@ find_join_input_rel(PlannerInfo *root, Relids relids)
 /*
  * Check whether char is a letter (and, hence, subject to case-folding)
  *
- * In multibyte character sets, we can't use isalpha, and it does not seem
+ * In multibyte character sets or with ICU, we can't use isalpha, and it does not seem
  * worth trying to convert to wchar_t to use iswalpha.  Instead, just assume
  * any multibyte char is potentially case-varying.
  */
@@ -5287,9 +5287,11 @@ pattern_char_isalpha(char c, bool is_multibyte,
 		return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
 	else if (is_multibyte && IS_HIGHBIT_SET(c))
 		return true;
+	else if (locale && locale->provider == COLLPROVIDER_ICU)
+		return IS_HIGHBIT_SET(c) ||
+			(c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
 #ifdef HAVE_LOCALE_T
-	else if (locale)
-		return isalpha_l((unsigned char) c, locale);
+	else if (locale && locale->provider == COLLPROVIDER_LIBC)
+		return isalpha_l((unsigned char) c, locale->info.lt);
 #endif
 	else
 		return isalpha((unsigned char) c);
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index 254379ade7..94877ae995 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -1403,10 +1403,7 @@ varstr_cmp(char *arg1, int len1, char *arg2, int len2, Oid collid)
 		char		a2buf[TEXTBUFLEN];
 		char	   *a1p,
 				   *a2p;
-
-#ifdef HAVE_LOCALE_T
 		pg_locale_t mylocale = 0;
-#endif
 
 		if (collid != DEFAULT_COLLATION_OID)
 		{
@@ -1421,9 +1418,7 @@ varstr_cmp(char *arg1, int len1, char *arg2, int len2, Oid collid)
 						 errmsg("could not determine which collation to use for string comparison"),
 						 errhint("Use the COLLATE clause to set the collation explicitly.")));
 			}
-#ifdef HAVE_LOCALE_T
 			mylocale = pg_newlocale_from_collation(collid);
-#endif
 		}
 
 		/*
@@ -1542,9 +1537,44 @@ varstr_cmp(char *arg1, int len1, char *arg2, int len2, Oid collid)
 		memcpy(a2p, arg2, len2);
 		a2p[len2] = '\0';
 
-#ifdef HAVE_LOCALE_T
 		if (mylocale)
-			result = strcoll_l(a1p, a2p, mylocale);
+		{
+#ifdef USE_ICU
+			if (mylocale->provider == COLLPROVIDER_ICU)
+			{
+#ifdef HAVE_UCOL_STRCOLLUTF8
+				if (GetDatabaseEncoding() == PG_UTF8)
+				{
+					UErrorCode	status;
+
+					status = U_ZERO_ERROR;
+					result = ucol_strcollUTF8(mylocale->info.icu.ucol,
+											  arg1, len1,
+											  arg2, len2,
+											  &status);
+					if (U_FAILURE(status))
+						ereport(ERROR,
+								(errmsg("collation failed: %s", u_errorName(status))));
+				}
+				else
+#endif
+				{
+					int32_t ulen1, ulen2;
+					UChar *uchar1, *uchar2;
+
+					ulen1 = icu_to_uchar(&uchar1, arg1, len1);
+					ulen2 = icu_to_uchar(&uchar2, arg2, len2);
+
+					result = ucol_strcoll(mylocale->info.icu.ucol,
+										  uchar1, ulen1,
+										  uchar2, ulen2);
+				}
+			}
+			else
+#endif
+#ifdef HAVE_LOCALE_T
+				result = strcoll_l(a1p, a2p, mylocale->info.lt);
+		}
 		else
 #endif
 			result = strcoll(a1p, a2p);
@@ -1768,10 +1798,7 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
 	bool		abbreviate = ssup->abbreviate;
 	bool		collate_c = false;
 	VarStringSortSupport *sss;
-
-#ifdef HAVE_LOCALE_T
 	pg_locale_t locale = 0;
-#endif
 
 	/*
 	 * If possible, set ssup->comparator to a function which can be used to
@@ -1826,9 +1853,7 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
 						 errmsg("could not determine which collation to use for string comparison"),
 						 errhint("Use the COLLATE clause to set the collation explicitly.")));
 			}
-#ifdef HAVE_LOCALE_T
 			locale = pg_newlocale_from_collation(collid);
-#endif
 		}
 	}
 
@@ -1854,7 +1879,7 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
 	 * platforms.
 	 */
 #ifndef TRUST_STRXFRM
-	if (!collate_c)
+	if (!collate_c && !(locale && locale->provider == COLLPROVIDER_ICU))
 		abbreviate = false;
 #endif
 
@@ -2090,9 +2115,44 @@ varstrfastcmp_locale(Datum x, Datum y, SortSupport ssup)
 		goto done;
 	}
 
-#ifdef HAVE_LOCALE_T
 	if (sss->locale)
-		result = strcoll_l(sss->buf1, sss->buf2, sss->locale);
+	{
+#ifdef USE_ICU
+		if (sss->locale->provider == COLLPROVIDER_ICU)
+		{
+#ifdef HAVE_UCOL_STRCOLLUTF8
+			if (GetDatabaseEncoding() == PG_UTF8)
+			{
+				UErrorCode	status;
+
+				status = U_ZERO_ERROR;
+				result = ucol_strcollUTF8(sss->locale->info.icu.ucol,
+										  a1p, len1,
+										  a2p, len2,
+										  &status);
+				if (U_FAILURE(status))
+					ereport(ERROR,
+							(errmsg("collation failed: %s", u_errorName(status))));
+			}
+			else
+#endif
+			{
+				int32_t ulen1, ulen2;
+				UChar *uchar1, *uchar2;
+
+				ulen1 = icu_to_uchar(&uchar1, a1p, len1);
+				ulen2 = icu_to_uchar(&uchar2, a2p, len2);
+
+				result = ucol_strcoll(sss->locale->info.icu.ucol,
+									  uchar1, ulen1,
+									  uchar2, ulen2);
+			}
+		}
+		else
+#endif
+#ifdef HAVE_LOCALE_T
+			result = strcoll_l(sss->buf1, sss->buf2, sss->locale->info.lt);
+	}
 	else
 #endif
 		result = strcoll(sss->buf1, sss->buf2);
@@ -2200,9 +2260,14 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
 	else
 	{
 		Size		bsize;
+#ifdef USE_ICU
+		int32_t		ulen = -1;
+		UChar	   *uchar;
+#endif
 
 		/*
-		 * We're not using the C collation, so fall back on strxfrm.
+		 * We're not using the C collation, so fall back on strxfrm or ICU
+		 * analogs.
 		 */
 
 		/* By convention, we use buffer 1 to store and NUL-terminate */
@@ -2222,17 +2287,66 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
 			goto done;
 		}
 
-		/* Just like strcoll(), strxfrm() expects a NUL-terminated string */
 		memcpy(sss->buf1, authoritative_data, len);
+		/*
+		 * Just like strcoll(), strxfrm() expects a NUL-terminated string.
+		 * Not necessary for ICU, but it doesn't hurt.
+		 */
 		sss->buf1[len] = '\0';
 		sss->last_len1 = len;
 
+#ifdef USE_ICU
+		/* When using ICU and not UTF8, convert string to UChar. */
+		if (sss->locale && sss->locale->provider == COLLPROVIDER_ICU &&
+			GetDatabaseEncoding() != PG_UTF8)
+			ulen = icu_to_uchar(&uchar, sss->buf1, len);
+#endif
+
+		/*
+		 * Loop: Call strxfrm() or ucol_getSortKey(), possibly enlarge buffer,
+		 * and try again.  Both of these functions have the result buffer
+		 * content undefined if the result did not fit, so we need to retry
+		 * until everything fits, even though we only need the first few bytes
+		 * in the end.  When using ucol_nextSortKeyPart(), however, we only
+		 * ask for as many bytes as we actually need.
+		 */
 		for (;;)
 		{
+#ifdef USE_ICU
+			if (sss->locale && sss->locale->provider == COLLPROVIDER_ICU)
+			{
+				/*
+				 * When using UTF8, use the iteration interface so that we
+				 * produce only as many sort key bytes as we actually need.
+				 */
+				if (GetDatabaseEncoding() == PG_UTF8)
+				{
+					UCharIterator iter;
+					uint32_t	state[2];
+					UErrorCode	status;
+
+					uiter_setUTF8(&iter, sss->buf1, len);
+					/* we only call this once, so no iteration state to carry over */
+					state[0] = state[1] = 0;
+					status = U_ZERO_ERROR;
+					bsize = ucol_nextSortKeyPart(sss->locale->info.icu.ucol,
+												 &iter,
+												 state,
+												 (uint8_t *) sss->buf2,
+												 Min(sizeof(Datum), sss->buflen2),
+												 &status);
+					if (U_FAILURE(status))
+						ereport(ERROR,
+								(errmsg("sort key generation failed: %s", u_errorName(status))));
+				}
+				else
+					bsize = ucol_getSortKey(sss->locale->info.icu.ucol,
+											uchar, ulen,
+											(uint8_t *) sss->buf2, sss->buflen2);
+			}
+			else
+#endif
 #ifdef HAVE_LOCALE_T
-			if (sss->locale)
+			if (sss->locale && sss->locale->provider == COLLPROVIDER_LIBC)
 				bsize = strxfrm_l(sss->buf2, sss->buf1,
-								  sss->buflen2, sss->locale);
+								  sss->buflen2, sss->locale->info.lt);
 			else
 #endif
 				bsize = strxfrm(sss->buf2, sss->buf1, sss->buflen2);
@@ -2242,8 +2356,7 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
 				break;
 
 			/*
-			 * The C standard states that the contents of the buffer is now
-			 * unspecified.  Grow buffer, and retry.
+			 * Grow buffer and retry.
 			 */
 			pfree(sss->buf2);
 			sss->buflen2 = Max(bsize + 1,
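The reason abbreviated keys can stay enabled for ICU collations above (the exception to the !TRUST_STRXFRM rule) is that ICU sort keys are NUL-terminated byte strings whose memcmp()/strcmp() order matches the collator's order, unlike some broken strxfrm() implementations. A small illustration of that property (not part of the patch; assumes the keys fit in the fixed buffers):

static int
icu_compare_via_sort_keys(UCollator *ucol,
						  const UChar *u1, int32_t ulen1,
						  const UChar *u2, int32_t ulen2)
{
	uint8_t		key1[1024];
	uint8_t		key2[1024];

	/* a real caller would check the returned lengths against the buffer size */
	ucol_getSortKey(ucol, u1, ulen1, key1, sizeof(key1));
	ucol_getSortKey(ucol, u2, ulen2, key2, sizeof(key2));

	/* sort keys are NUL-terminated and compare bytewise */
	return strcmp((const char *) key1, (const char *) key2);
}
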
diff --git a/src/backend/utils/mb/encnames.c b/src/backend/utils/mb/encnames.c
index 11099b844f..444eec25b5 100644
--- a/src/backend/utils/mb/encnames.c
+++ b/src/backend/utils/mb/encnames.c
@@ -403,6 +403,82 @@ const pg_enc2gettext pg_enc2gettext_tbl[] =
 };
 
 
+#ifndef FRONTEND
+
+/*
+ * Table of encoding names for ICU
+ *
+ * Reference: <https://ssl.icu-project.org/icu-bin/convexp>
+ *
+ * NULL entries are not supported by ICU, or their mapping is unclear.
+ */
+static const char * const pg_enc2icu_tbl[] =
+{
+	NULL,					/* PG_SQL_ASCII */
+	"EUC-JP",				/* PG_EUC_JP */
+	"EUC-CN",				/* PG_EUC_CN */
+	"EUC-KR",				/* PG_EUC_KR */
+	"EUC-TW",				/* PG_EUC_TW */
+	NULL,					/* PG_EUC_JIS_2004 */
+	"UTF-8",				/* PG_UTF8 */
+	NULL,					/* PG_MULE_INTERNAL */
+	"ISO-8859-1",			/* PG_LATIN1 */
+	"ISO-8859-2",			/* PG_LATIN2 */
+	"ISO-8859-3",			/* PG_LATIN3 */
+	"ISO-8859-4",			/* PG_LATIN4 */
+	"ISO-8859-9",			/* PG_LATIN5 */
+	"ISO-8859-10",			/* PG_LATIN6 */
+	"ISO-8859-13",			/* PG_LATIN7 */
+	"ISO-8859-14",			/* PG_LATIN8 */
+	"ISO-8859-15",			/* PG_LATIN9 */
+	NULL,					/* PG_LATIN10 */
+	"CP1256",				/* PG_WIN1256 */
+	"CP1258",				/* PG_WIN1258 */
+	"CP866",				/* PG_WIN866 */
+	NULL,					/* PG_WIN874 */
+	"KOI8-R",				/* PG_KOI8R */
+	"CP1251",				/* PG_WIN1251 */
+	"CP1252",				/* PG_WIN1252 */
+	"ISO-8859-5",			/* PG_ISO_8859_5 */
+	"ISO-8859-6",			/* PG_ISO_8859_6 */
+	"ISO-8859-7",			/* PG_ISO_8859_7 */
+	"ISO-8859-8",			/* PG_ISO_8859_8 */
+	"CP1250",				/* PG_WIN1250 */
+	"CP1253",				/* PG_WIN1253 */
+	"CP1254",				/* PG_WIN1254 */
+	"CP1255",				/* PG_WIN1255 */
+	"CP1257",				/* PG_WIN1257 */
+	"KOI8-U",				/* PG_KOI8U */
+};
+
+bool
+is_encoding_supported_by_icu(int encoding)
+{
+	return (pg_enc2icu_tbl[encoding] != NULL);
+}
+
+const char *
+get_encoding_name_for_icu(int encoding)
+{
+	const char *icu_encoding_name;
+
+	StaticAssertStmt(lengthof(pg_enc2icu_tbl) == PG_ENCODING_BE_LAST + 1,
+					 "pg_enc2icu_tbl incomplete");
+
+	icu_encoding_name = pg_enc2icu_tbl[encoding];
+
+	if (!icu_encoding_name)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("encoding \"%s\" not supported by ICU",
+						pg_encoding_to_char(encoding))));
+
+	return icu_encoding_name;
+}
+
+#endif /* not FRONTEND */
+
+
 /* ----------
  * Encoding checks, for error returns -1 else encoding id
  * ----------
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 540427a892..3078454837 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -62,6 +62,7 @@
 
 #include "catalog/catalog.h"
 #include "catalog/pg_authid.h"
+#include "catalog/pg_collation.h"
 #include "common/file_utils.h"
 #include "common/restricted_token.h"
 #include "common/username.h"
@@ -1618,7 +1619,7 @@ setup_collation(FILE *cmdfd)
 	PG_CMD_PUTS("SELECT pg_import_system_collations(if_not_exists => false, schema => 'pg_catalog');\n\n");
 
 	/* Add an SQL-standard name */
-	PG_CMD_PRINTF2("INSERT INTO pg_collation (collname, collnamespace, collowner, collencoding, collcollate, collctype) VALUES ('ucs_basic', 'pg_catalog'::regnamespace, %u, %d, 'C', 'C');\n\n", BOOTSTRAP_SUPERUSERID, PG_UTF8);
+	PG_CMD_PRINTF3("INSERT INTO pg_collation (collname, collnamespace, collowner, collprovider, collencoding, collcollate, collctype) VALUES ('ucs_basic', 'pg_catalog'::regnamespace, %u, '%c', %d, 'C', 'C');\n\n", BOOTSTRAP_SUPERUSERID, COLLPROVIDER_LIBC, PG_UTF8);
 }
 
 /*
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 7364a12c25..23917b7fa4 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -12760,8 +12760,10 @@ dumpCollation(Archive *fout, CollInfo *collinfo)
 	PQExpBuffer delq;
 	PQExpBuffer labelq;
 	PGresult   *res;
+	int			i_collprovider;
 	int			i_collcollate;
 	int			i_collctype;
+	const char *collprovider;
 	const char *collcollate;
 	const char *collctype;
 
@@ -12778,18 +12780,30 @@ dumpCollation(Archive *fout, CollInfo *collinfo)
 	selectSourceSchema(fout, collinfo->dobj.namespace->dobj.name);
 
 	/* Get collation-specific details */
-	appendPQExpBuffer(query, "SELECT "
-					  "collcollate, "
-					  "collctype "
-					  "FROM pg_catalog.pg_collation c "
-					  "WHERE c.oid = '%u'::pg_catalog.oid",
-					  collinfo->dobj.catId.oid);
+	if (fout->remoteVersion >= 100000)
+		appendPQExpBuffer(query, "SELECT "
+						  "collprovider, "
+						  "collcollate, "
+						  "collctype "
+						  "FROM pg_catalog.pg_collation c "
+						  "WHERE c.oid = '%u'::pg_catalog.oid",
+						  collinfo->dobj.catId.oid);
+	else
+		appendPQExpBuffer(query, "SELECT "
+						  "'p'::char AS collprovider, "
+						  "collcollate, "
+						  "collctype "
+						  "FROM pg_catalog.pg_collation c "
+						  "WHERE c.oid = '%u'::pg_catalog.oid",
+						  collinfo->dobj.catId.oid);
 
 	res = ExecuteSqlQueryForSingleRow(fout, query->data);
 
+	i_collprovider = PQfnumber(res, "collprovider");
 	i_collcollate = PQfnumber(res, "collcollate");
 	i_collctype = PQfnumber(res, "collctype");
 
+	collprovider = PQgetvalue(res, 0, i_collprovider);
 	collcollate = PQgetvalue(res, 0, i_collcollate);
 	collctype = PQgetvalue(res, 0, i_collctype);
 
@@ -12801,11 +12815,32 @@ dumpCollation(Archive *fout, CollInfo *collinfo)
 	appendPQExpBuffer(delq, ".%s;\n",
 					  fmtId(collinfo->dobj.name));
 
-	appendPQExpBuffer(q, "CREATE COLLATION %s (lc_collate = ",
+	appendPQExpBuffer(q, "CREATE COLLATION %s (",
 					  fmtId(collinfo->dobj.name));
-	appendStringLiteralAH(q, collcollate, fout);
-	appendPQExpBufferStr(q, ", lc_ctype = ");
-	appendStringLiteralAH(q, collctype, fout);
+
+	appendPQExpBufferStr(q, "provider = ");
+	if (collprovider[0] == 'c')
+		appendStringLiteralAH(q, "libc", fout);
+	else if (collprovider[0] == 'i')
+		appendStringLiteralAH(q, "icu", fout);
+	else
+		exit_horribly(NULL,
+					  "unrecognized collation provider: %s\n",
+					  collprovider);
+
+	if (strcmp(collcollate, collctype) == 0)
+	{
+		appendPQExpBufferStr(q, ", locale = ");
+		appendStringLiteralAH(q, collcollate, fout);
+	}
+	else
+	{
+		appendPQExpBufferStr(q, ", lc_collate = ");
+		appendStringLiteralAH(q, collcollate, fout);
+		appendPQExpBufferStr(q, ", lc_ctype = ");
+		appendStringLiteralAH(q, collctype, fout);
+	}
+
 	appendPQExpBufferStr(q, ");\n");
 
 	appendPQExpBuffer(labelq, "COLLATION %s", fmtId(collinfo->dobj.name));
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index e2e4cbcc08..ed2eeea81b 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3701,7 +3701,7 @@ listCollations(const char *pattern, bool verbose, bool showSystem)
 	PQExpBufferData buf;
 	PGresult   *res;
 	printQueryOpt myopt = pset.popt;
-	static const bool translate_columns[] = {false, false, false, false, false};
+	static const bool translate_columns[] = {false, false, false, false, false, false};
 
 	if (pset.sversion < 90100)
 	{
@@ -3725,6 +3725,11 @@ listCollations(const char *pattern, bool verbose, bool showSystem)
 					  gettext_noop("Collate"),
 					  gettext_noop("Ctype"));
 
+	if (pset.sversion >= 100000)
+		appendPQExpBuffer(&buf,
+						  ",\n       CASE c.collprovider WHEN 'd' THEN 'default' WHEN 'c' THEN 'libc' WHEN 'i' THEN 'icu' END AS \"%s\"",
+						  gettext_noop("Provider"));
+
 	if (verbose)
 		appendPQExpBuffer(&buf,
 						  ",\n       pg_catalog.obj_description(c.oid, 'pg_collation') AS \"%s\"",
diff --git a/src/include/catalog/pg_collation.h b/src/include/catalog/pg_collation.h
index 30c87e004e..8edd8aa066 100644
--- a/src/include/catalog/pg_collation.h
+++ b/src/include/catalog/pg_collation.h
@@ -34,9 +34,13 @@ CATALOG(pg_collation,3456)
 	NameData	collname;		/* collation name */
 	Oid			collnamespace;	/* OID of namespace containing collation */
 	Oid			collowner;		/* owner of collation */
+	char		collprovider;	/* see constants below */
 	int32		collencoding;	/* encoding for this collation; -1 = "all" */
 	NameData	collcollate;	/* LC_COLLATE setting */
 	NameData	collctype;		/* LC_CTYPE setting */
+#ifdef CATALOG_VARLEN			/* variable-length fields start here */
+	text		collversion;	/* provider-dependent version of collation data */
+#endif
 } FormData_pg_collation;
 
 /* ----------------
@@ -50,27 +54,34 @@ typedef FormData_pg_collation *Form_pg_collation;
  *		compiler constants for pg_collation
  * ----------------
  */
-#define Natts_pg_collation				6
+#define Natts_pg_collation				8
 #define Anum_pg_collation_collname		1
 #define Anum_pg_collation_collnamespace 2
 #define Anum_pg_collation_collowner		3
-#define Anum_pg_collation_collencoding	4
-#define Anum_pg_collation_collcollate	5
-#define Anum_pg_collation_collctype		6
+#define Anum_pg_collation_collprovider	4
+#define Anum_pg_collation_collencoding	5
+#define Anum_pg_collation_collcollate	6
+#define Anum_pg_collation_collctype		7
+#define Anum_pg_collation_collversion	8
 
 /* ----------------
  *		initial contents of pg_collation
  * ----------------
  */
 
-DATA(insert OID = 100 ( default		PGNSP PGUID -1 "" "" ));
+DATA(insert OID = 100 ( default		PGNSP PGUID d -1 "" "" _null_ ));
 DESCR("database's default collation");
 #define DEFAULT_COLLATION_OID	100
-DATA(insert OID = 950 ( C			PGNSP PGUID -1 "C" "C" ));
+DATA(insert OID = 950 ( C			PGNSP PGUID c -1 "C" "C" _null_ ));
 DESCR("standard C collation");
 #define C_COLLATION_OID			950
-DATA(insert OID = 951 ( POSIX		PGNSP PGUID -1 "POSIX" "POSIX" ));
+DATA(insert OID = 951 ( POSIX		PGNSP PGUID c -1 "POSIX" "POSIX" _null_ ));
 DESCR("standard POSIX collation");
 #define POSIX_COLLATION_OID		951
 
+
+#define COLLPROVIDER_DEFAULT	'd'
+#define COLLPROVIDER_ICU		'i'
+#define COLLPROVIDER_LIBC		'c'
+
 #endif   /* PG_COLLATION_H */
diff --git a/src/include/catalog/pg_collation_fn.h b/src/include/catalog/pg_collation_fn.h
index 482ba7920e..2a50e54124 100644
--- a/src/include/catalog/pg_collation_fn.h
+++ b/src/include/catalog/pg_collation_fn.h
@@ -16,6 +16,7 @@
 
 extern Oid CollationCreate(const char *collname, Oid collnamespace,
 				Oid collowner,
+				char collprovider,
 				int32 collencoding,
 				const char *collcollate, const char *collctype,
 				bool if_not_exists);
diff --git a/src/include/commands/collationcmds.h b/src/include/commands/collationcmds.h
index 3b2fcb8271..df5623ccb6 100644
--- a/src/include/commands/collationcmds.h
+++ b/src/include/commands/collationcmds.h
@@ -20,5 +20,6 @@
 
 extern ObjectAddress DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_exists);
 extern void IsThereCollationInNamespace(const char *collname, Oid nspOid);
+extern ObjectAddress AlterCollation(AlterCollationStmt *stmt);
 
 #endif   /* COLLATIONCMDS_H */
diff --git a/src/include/mb/pg_wchar.h b/src/include/mb/pg_wchar.h
index ceb5695924..5e279b319a 100644
--- a/src/include/mb/pg_wchar.h
+++ b/src/include/mb/pg_wchar.h
@@ -333,6 +333,12 @@ typedef struct pg_enc2gettext
 extern const pg_enc2gettext pg_enc2gettext_tbl[];
 
 /*
+ * Encoding names for ICU
+ */
+extern bool is_encoding_supported_by_icu(int encoding);
+extern const char *get_encoding_name_for_icu(int encoding);
+
+/*
  * pg_wchar stuff
  */
 typedef int (*mb2wchar_with_len_converter) (const unsigned char *from,
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 95dd8baadd..8020312f13 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -416,6 +416,7 @@ typedef enum NodeTag
 	T_CreateSubscriptionStmt,
 	T_AlterSubscriptionStmt,
 	T_DropSubscriptionStmt,
+	T_AlterCollationStmt,
 
 	/*
 	 * TAGS FOR PARSE TREE NODES (parsenodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 5afc3ebea0..3b96bb84e7 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1694,6 +1694,17 @@ typedef struct AlterTableCmd	/* one subcommand of an ALTER TABLE */
 
 
 /* ----------------------
+ * Alter Collation
+ * ----------------------
+ */
+typedef struct AlterCollationStmt
+{
+	NodeTag		type;
+	List	   *collname;
+} AlterCollationStmt;
+
+
+/* ----------------------
  *	Alter Domain
  *
  * The fields are used in different ways by the different variants of
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index b9dfdd41c1..0b48f30379 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -618,6 +618,9 @@
 /* Define to 1 if you have the external array `tzname'. */
 #undef HAVE_TZNAME
 
+/* Define to 1 if you have the `ucol_strcollUTF8' function. */
+#undef HAVE_UCOL_STRCOLLUTF8
+
 /* Define to 1 if you have the <ucred.h> header file. */
 #undef HAVE_UCRED_H
 
@@ -831,6 +834,9 @@
    (--enable-float8-byval) */
 #undef USE_FLOAT8_BYVAL
 
+/* Define to build with ICU support. (--with-icu) */
+#undef USE_ICU
+
 /* Define to 1 if you want 64-bit integer timestamp and interval support.
    (--enable-integer-datetimes) */
 #undef USE_INTEGER_DATETIMES
diff --git a/src/include/utils/pg_locale.h b/src/include/utils/pg_locale.h
index be91f9b88c..bbd1b51cf9 100644
--- a/src/include/utils/pg_locale.h
+++ b/src/include/utils/pg_locale.h
@@ -16,6 +16,9 @@
 #if defined(LOCALE_T_IN_XLOCALE) || defined(WCSTOMBS_L_IN_XLOCALE)
 #include <xlocale.h>
 #endif
+#ifdef USE_ICU
+#include <unicode/ucol.h>
+#endif
 
 #include "utils/guc.h"
 
@@ -62,17 +65,36 @@ extern void cache_locale_time(void);
  * We define our own wrapper around locale_t so we can keep the same
  * function signatures for all builds, while not having to create a
  * fake version of the standard type locale_t in the global namespace.
- * The fake version of pg_locale_t can be checked for truth; that's
- * about all it will be needed for.
+ * pg_locale_t is occasionally checked for truth, so make it a pointer.
  */
+struct pg_locale_struct
+{
+	char	provider;
+	union
+	{
 #ifdef HAVE_LOCALE_T
-typedef locale_t pg_locale_t;
-#else
-typedef int pg_locale_t;
+		locale_t lt;
+#endif
+#ifdef USE_ICU
+		struct {
+			const char *locale;
+			UCollator *ucol;
+		} icu;
 #endif
+	} info;
+};
+
+typedef struct pg_locale_struct *pg_locale_t;
 
 extern pg_locale_t pg_newlocale_from_collation(Oid collid);
 
+extern char *get_system_collation_version(char collprovider, const char *collcollate);
+
+#ifdef USE_ICU
+extern int32_t icu_to_uchar(UChar **buff_uchar, const char *buff, size_t nbytes);
+extern int32_t icu_from_uchar(char **result, UChar *buff_uchar, int32_t len_uchar);
+#endif
+
 /* These functions convert from/to libc's wchar_t, *not* pg_wchar_t */
 #ifdef USE_WIDE_UPPER_LOWER
 extern size_t wchar2char(char *to, const wchar_t *from, size_t tolen,
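Since pg_locale_t changes here from a bare locale_t to a struct pointer, any code outside this patch that holds a pg_locale_t has to branch on the provider, the same way varstr_cmp() does above. A hypothetical caller sketch, assuming a UTF-8 database and NUL-terminated inputs, with error handling omitted for brevity:

static int
compare_with_pg_locale(const char *s1, const char *s2, pg_locale_t loc)
{
	if (!loc)
		return strcoll(s1, s2);		/* default collation: process locale */

#ifdef USE_ICU
	if (loc->provider == COLLPROVIDER_ICU)
	{
		UErrorCode	status = U_ZERO_ERROR;

		/* -1 means NUL-terminated; ucol_strcollUTF8 requires UTF-8 input */
		return ucol_strcollUTF8(loc->info.icu.ucol, s1, -1, s2, -1, &status);
	}
#endif
#ifdef HAVE_LOCALE_T
	if (loc->provider == COLLPROVIDER_LIBC)
		return strcoll_l(s1, s2, loc->info.lt);
#endif
	return strcoll(s1, s2);
}
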
diff --git a/src/test/regress/GNUmakefile b/src/test/regress/GNUmakefile
index b923ea1420..a747facb9a 100644
--- a/src/test/regress/GNUmakefile
+++ b/src/test/regress/GNUmakefile
@@ -125,6 +125,9 @@ tablespace-setup:
 ##
 
 REGRESS_OPTS = --dlpath=. $(EXTRA_REGRESS_OPTS)
+ifeq ($(with_icu),yes)
+override EXTRA_TESTS := collate.icu $(EXTRA_TESTS)
+endif
 
 check: all tablespace-setup
 	$(pg_regress_check) $(REGRESS_OPTS) --schedule=$(srcdir)/parallel_schedule $(MAXCONNOPT) $(EXTRA_TESTS)
diff --git a/src/test/regress/expected/collate.linux.utf8.out b/src/test/regress/expected/collate.icu.out
similarity index 80%
copy from src/test/regress/expected/collate.linux.utf8.out
copy to src/test/regress/expected/collate.icu.out
index 293e78641e..0943e706bb 100644
--- a/src/test/regress/expected/collate.linux.utf8.out
+++ b/src/test/regress/expected/collate.icu.out
@@ -1,54 +1,54 @@
 /*
- * This test is for Linux/glibc systems and assumes that a full set of
- * locales is installed.  It must be run in a database with UTF-8 encoding,
- * because other encodings don't support all the characters used.
+ * This test is for ICU collations.
  */
 SET client_encoding TO UTF8;
+CREATE SCHEMA collate_tests;
+SET search_path = collate_tests;
 CREATE TABLE collate_test1 (
     a int,
-    b text COLLATE "en_US" NOT NULL
+    b text COLLATE "en%icu" NOT NULL
 );
 \d collate_test1
-           Table "public.collate_test1"
+        Table "collate_tests.collate_test1"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  a      | integer |           |          | 
- b      | text    | en_US     | not null | 
+ b      | text    | en%icu    | not null | 
 
 CREATE TABLE collate_test_fail (
     a int,
-    b text COLLATE "ja_JP.eucjp"
+    b text COLLATE "ja_JP.eucjp%icu"
 );
-ERROR:  collation "ja_JP.eucjp" for encoding "UTF8" does not exist
-LINE 3:     b text COLLATE "ja_JP.eucjp"
+ERROR:  collation "ja_JP.eucjp%icu" for encoding "UTF8" does not exist
+LINE 3:     b text COLLATE "ja_JP.eucjp%icu"
                    ^
 CREATE TABLE collate_test_fail (
     a int,
-    b text COLLATE "foo"
+    b text COLLATE "foo%icu"
 );
-ERROR:  collation "foo" for encoding "UTF8" does not exist
-LINE 3:     b text COLLATE "foo"
+ERROR:  collation "foo%icu" for encoding "UTF8" does not exist
+LINE 3:     b text COLLATE "foo%icu"
                    ^
 CREATE TABLE collate_test_fail (
-    a int COLLATE "en_US",
+    a int COLLATE "en%icu",
     b text
 );
 ERROR:  collations are not supported by type integer
-LINE 2:     a int COLLATE "en_US",
+LINE 2:     a int COLLATE "en%icu",
                   ^
 CREATE TABLE collate_test_like (
     LIKE collate_test1
 );
 \d collate_test_like
-         Table "public.collate_test_like"
+      Table "collate_tests.collate_test_like"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  a      | integer |           |          | 
- b      | text    | en_US     | not null | 
+ b      | text    | en%icu    | not null | 
 
 CREATE TABLE collate_test2 (
     a int,
-    b text COLLATE "sv_SE"
+    b text COLLATE "sv%icu"
 );
 CREATE TABLE collate_test3 (
     a int,
@@ -106,12 +106,12 @@ SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "C";
  3 | bbc
 (2 rows)
 
-SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "en_US";
-ERROR:  collation mismatch between explicit collations "C" and "en_US"
+SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "en%icu";
+ERROR:  collation mismatch between explicit collations "C" and "en%icu"
 LINE 1: ...* FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "e...
                                                              ^
-CREATE DOMAIN testdomain_sv AS text COLLATE "sv_SE";
-CREATE DOMAIN testdomain_i AS int COLLATE "sv_SE"; -- fails
+CREATE DOMAIN testdomain_sv AS text COLLATE "sv%icu";
+CREATE DOMAIN testdomain_i AS int COLLATE "sv%icu"; -- fails
 ERROR:  collations are not supported by type integer
 CREATE TABLE collate_test4 (
     a int,
@@ -129,7 +129,7 @@ SELECT a, b FROM collate_test4 ORDER BY b;
 
 CREATE TABLE collate_test5 (
     a int,
-    b testdomain_sv COLLATE "en_US"
+    b testdomain_sv COLLATE "en%icu"
 );
 INSERT INTO collate_test5 SELECT * FROM collate_test1;
 SELECT a, b FROM collate_test5 ORDER BY b;
@@ -206,13 +206,13 @@ SELECT * FROM collate_test3 ORDER BY b;
 (4 rows)
 
 -- constant expression folding
-SELECT 'bbc' COLLATE "en_US" > 'äbc' COLLATE "en_US" AS "true";
+SELECT 'bbc' COLLATE "en%icu" > 'äbc' COLLATE "en%icu" AS "true";
  true 
 ------
  t
 (1 row)
 
-SELECT 'bbc' COLLATE "sv_SE" > 'äbc' COLLATE "sv_SE" AS "false";
+SELECT 'bbc' COLLATE "sv%icu" > 'äbc' COLLATE "sv%icu" AS "false";
  false 
 -------
  f
@@ -221,8 +221,8 @@ SELECT 'bbc' COLLATE "sv_SE" > 'äbc' COLLATE "sv_SE" AS "false";
 -- upper/lower
 CREATE TABLE collate_test10 (
     a int,
-    x text COLLATE "en_US",
-    y text COLLATE "tr_TR"
+    x text COLLATE "en%icu",
+    y text COLLATE "tr%icu"
 );
 INSERT INTO collate_test10 VALUES (1, 'hij', 'hij'), (2, 'HIJ', 'HIJ');
 SELECT a, lower(x), lower(y), upper(x), upper(y), initcap(x), initcap(y) FROM collate_test10;
@@ -290,25 +290,25 @@ SELECT * FROM collate_test1 WHERE b ILIKE '%bc%';
  4 | ABC
 (4 rows)
 
-SELECT 'Türkiye' COLLATE "en_US" ILIKE '%KI%' AS "true";
+SELECT 'Türkiye' COLLATE "en%icu" ILIKE '%KI%' AS "true";
  true 
 ------
  t
 (1 row)
 
-SELECT 'Türkiye' COLLATE "tr_TR" ILIKE '%KI%' AS "false";
+SELECT 'Türkiye' COLLATE "tr%icu" ILIKE '%KI%' AS "false";
  false 
 -------
  f
 (1 row)
 
-SELECT 'bıt' ILIKE 'BIT' COLLATE "en_US" AS "false";
+SELECT 'bıt' ILIKE 'BIT' COLLATE "en%icu" AS "false";
  false 
 -------
  f
 (1 row)
 
-SELECT 'bıt' ILIKE 'BIT' COLLATE "tr_TR" AS "true";
+SELECT 'bıt' ILIKE 'BIT' COLLATE "tr%icu" AS "true";
  true 
 ------
  t
@@ -364,30 +364,62 @@ SELECT * FROM collate_test1 WHERE b ~* 'bc';
  4 | ABC
 (4 rows)
 
-SELECT 'Türkiye' COLLATE "en_US" ~* 'KI' AS "true";
+CREATE TABLE collate_test6 (
+    a int,
+    b text COLLATE "en%icu"
+);
+INSERT INTO collate_test6 VALUES (1, 'abc'), (2, 'ABC'), (3, '123'), (4, 'ab1'),
+                                 (5, 'a1!'), (6, 'a c'), (7, '!.;'), (8, '   '),
+                                 (9, 'äbç'), (10, 'ÄBÇ');
+SELECT b,
+       b ~ '^[[:alpha:]]+$' AS is_alpha,
+       b ~ '^[[:upper:]]+$' AS is_upper,
+       b ~ '^[[:lower:]]+$' AS is_lower,
+       b ~ '^[[:digit:]]+$' AS is_digit,
+       b ~ '^[[:alnum:]]+$' AS is_alnum,
+       b ~ '^[[:graph:]]+$' AS is_graph,
+       b ~ '^[[:print:]]+$' AS is_print,
+       b ~ '^[[:punct:]]+$' AS is_punct,
+       b ~ '^[[:space:]]+$' AS is_space
+FROM collate_test6;
+  b  | is_alpha | is_upper | is_lower | is_digit | is_alnum | is_graph | is_print | is_punct | is_space 
+-----+----------+----------+----------+----------+----------+----------+----------+----------+----------
+ abc | t        | f        | t        | f        | t        | t        | t        | f        | f
+ ABC | t        | t        | f        | f        | t        | t        | t        | f        | f
+ 123 | f        | f        | f        | t        | t        | t        | t        | f        | f
+ ab1 | f        | f        | f        | f        | t        | t        | t        | f        | f
+ a1! | f        | f        | f        | f        | f        | t        | t        | f        | f
+ a c | f        | f        | f        | f        | f        | f        | t        | f        | f
+ !.; | f        | f        | f        | f        | f        | t        | t        | t        | f
+     | f        | f        | f        | f        | f        | f        | t        | f        | t
+ äbç | t        | f        | t        | f        | t        | t        | t        | f        | f
+ ÄBÇ | t        | t        | f        | f        | t        | t        | t        | f        | f
+(10 rows)
+
+SELECT 'Türkiye' COLLATE "en%icu" ~* 'KI' AS "true";
+ true 
+------
+ t
+(1 row)
+
+SELECT 'Türkiye' COLLATE "tr%icu" ~* 'KI' AS "true";  -- true with ICU
  true 
 ------
  t
 (1 row)
 
-SELECT 'Türkiye' COLLATE "tr_TR" ~* 'KI' AS "false";
+SELECT 'bıt' ~* 'BIT' COLLATE "en%icu" AS "false";
  false 
 -------
  f
 (1 row)
 
-SELECT 'bıt' ~* 'BIT' COLLATE "en_US" AS "false";
+SELECT 'bıt' ~* 'BIT' COLLATE "tr%icu" AS "false";  -- false with ICU
  false 
 -------
  f
 (1 row)
 
-SELECT 'bıt' ~* 'BIT' COLLATE "tr_TR" AS "true";
- true 
-------
- t
-(1 row)
-
 -- The following actually exercises the selectivity estimation for ~*.
 SELECT relname FROM pg_class WHERE relname ~* '^abc';
  relname 
@@ -402,7 +434,7 @@ SELECT to_char(date '2010-04-01', 'DD TMMON YYYY');
  01 NIS 2010
 (1 row)
 
-SELECT to_char(date '2010-04-01', 'DD TMMON YYYY' COLLATE "tr_TR");
+SELECT to_char(date '2010-04-01', 'DD TMMON YYYY' COLLATE "tr%icu");
    to_char   
 -------------
  01 NİS 2010
@@ -693,7 +725,7 @@ SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM collate_test3; -- ok
 (8 rows)
 
 SELECT a, b FROM collate_test1 UNION SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
-ERROR:  collation mismatch between implicit collations "en_US" and "C"
+ERROR:  collation mismatch between implicit collations "en%icu" and "C"
 LINE 1: SELECT a, b FROM collate_test1 UNION SELECT a, b FROM collat...
                                                        ^
 HINT:  You can choose the collation by applying the COLLATE clause to one or both expressions.
@@ -707,12 +739,12 @@ SELECT a, b COLLATE "C" FROM collate_test1 UNION SELECT a, b FROM collate_test3
 (4 rows)
 
 SELECT a, b FROM collate_test1 INTERSECT SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
-ERROR:  collation mismatch between implicit collations "en_US" and "C"
+ERROR:  collation mismatch between implicit collations "en%icu" and "C"
 LINE 1: ...ELECT a, b FROM collate_test1 INTERSECT SELECT a, b FROM col...
                                                              ^
 HINT:  You can choose the collation by applying the COLLATE clause to one or both expressions.
 SELECT a, b FROM collate_test1 EXCEPT SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
-ERROR:  collation mismatch between implicit collations "en_US" and "C"
+ERROR:  collation mismatch between implicit collations "en%icu" and "C"
 LINE 1: SELECT a, b FROM collate_test1 EXCEPT SELECT a, b FROM colla...
                                                         ^
 HINT:  You can choose the collation by applying the COLLATE clause to one or both expressions.
@@ -731,18 +763,18 @@ select x || y from collate_test10; -- ok, because || is not collation aware
 (2 rows)
 
 select x, y from collate_test10 order by x || y; -- not so ok
-ERROR:  collation mismatch between implicit collations "en_US" and "tr_TR"
+ERROR:  collation mismatch between implicit collations "en%icu" and "tr%icu"
 LINE 1: select x, y from collate_test10 order by x || y;
                                                       ^
 HINT:  You can choose the collation by applying the COLLATE clause to one or both expressions.
 -- collation mismatch between recursive and non-recursive term
 WITH RECURSIVE foo(x) AS
-   (SELECT x FROM (VALUES('a' COLLATE "en_US"),('b')) t(x)
+   (SELECT x FROM (VALUES('a' COLLATE "en%icu"),('b')) t(x)
    UNION ALL
-   SELECT (x || 'c') COLLATE "de_DE" FROM foo WHERE length(x) < 10)
+   SELECT (x || 'c') COLLATE "de%icu" FROM foo WHERE length(x) < 10)
 SELECT * FROM foo;
-ERROR:  recursive query "foo" column 1 has collation "en_US" in non-recursive term but collation "de_DE" overall
-LINE 2:    (SELECT x FROM (VALUES('a' COLLATE "en_US"),('b')) t(x)
+ERROR:  recursive query "foo" column 1 has collation "en%icu" in non-recursive term but collation "de%icu" overall
+LINE 2:    (SELECT x FROM (VALUES('a' COLLATE "en%icu"),('b')) t(x)
                    ^
 HINT:  Use the COLLATE clause to set the collation of the non-recursive term.
 -- casting
@@ -843,7 +875,7 @@ begin
   return xx < yy;
 end
 $$;
-SELECT mylt2('a', 'B' collate "en_US") as t, mylt2('a', 'B' collate "C") as f;
+SELECT mylt2('a', 'B' collate "en%icu") as t, mylt2('a', 'B' collate "C") as f;
  t | f 
 ---+---
  t | f
@@ -957,29 +989,23 @@ CREATE SCHEMA test_schema;
 -- We need to do this this way to cope with varying names for encodings:
 do $$
 BEGIN
-  EXECUTE 'CREATE COLLATION test0 (locale = ' ||
+  EXECUTE 'CREATE COLLATION test0 (provider = icu, locale = ' ||
           quote_literal(current_setting('lc_collate')) || ');';
 END
 $$;
 CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
-ERROR:  collation "test0" for encoding "UTF8" already exists
-CREATE COLLATION IF NOT EXISTS test0 FROM "C"; -- ok, skipped
-NOTICE:  collation "test0" for encoding "UTF8" already exists, skipping
-CREATE COLLATION IF NOT EXISTS test0 (locale = 'foo'); -- ok, skipped
-NOTICE:  collation "test0" for encoding "UTF8" already exists, skipping
+ERROR:  collation "test0" already exists
 do $$
 BEGIN
-  EXECUTE 'CREATE COLLATION test1 (lc_collate = ' ||
+  EXECUTE 'CREATE COLLATION test1 (provider = icu, lc_collate = ' ||
           quote_literal(current_setting('lc_collate')) ||
           ', lc_ctype = ' ||
           quote_literal(current_setting('lc_ctype')) || ');';
 END
 $$;
-CREATE COLLATION test3 (lc_collate = 'en_US.utf8'); -- fail, need lc_ctype
+CREATE COLLATION test3 (provider = icu, lc_collate = 'en_US.utf8'); -- fail, need lc_ctype
 ERROR:  parameter "lc_ctype" must be specified
-CREATE COLLATION testx (locale = 'nonsense'); -- fail
-ERROR:  could not create locale "nonsense": No such file or directory
-DETAIL:  The operating system could not find any locale data for the locale name "nonsense".
+CREATE COLLATION testx (provider = icu, locale = 'nonsense'); /* never fails with ICU */  DROP COLLATION testx;
 CREATE COLLATION test4 FROM nonsense;
 ERROR:  collation "nonsense" for encoding "UTF8" does not exist
 CREATE COLLATION test5 FROM test0;
@@ -993,7 +1019,7 @@ SELECT collname FROM pg_collation WHERE collname LIKE 'test%' ORDER BY 1;
 
 ALTER COLLATION test1 RENAME TO test11;
 ALTER COLLATION test0 RENAME TO test11; -- fail
-ERROR:  collation "test11" for encoding "UTF8" already exists in schema "public"
+ERROR:  collation "test11" already exists in schema "collate_tests"
 ALTER COLLATION test1 RENAME TO test22; -- fail
 ERROR:  collation "test1" for encoding "UTF8" does not exist
 ALTER COLLATION test11 OWNER TO regress_test_role;
@@ -1005,11 +1031,11 @@ SELECT collname, nspname, obj_description(pg_collation.oid, 'pg_collation')
     FROM pg_collation JOIN pg_namespace ON (collnamespace = pg_namespace.oid)
     WHERE collname LIKE 'test%'
     ORDER BY 1;
- collname |   nspname   | obj_description 
-----------+-------------+-----------------
- test0    | public      | US English
- test11   | test_schema | 
- test5    | public      | 
+ collname |    nspname    | obj_description 
+----------+---------------+-----------------
+ test0    | collate_tests | US English
+ test11   | test_schema   | 
+ test5    | collate_tests | 
 (3 rows)
 
 DROP COLLATION test0, test_schema.test11, test5;
@@ -1024,6 +1050,9 @@ SELECT collname FROM pg_collation WHERE collname LIKE 'test%';
 
 DROP SCHEMA test_schema;
 DROP ROLE regress_test_role;
+-- ALTER
+ALTER COLLATION "en%icu" REFRESH VERSION;
+NOTICE:  version has not changed
 -- dependencies
 CREATE COLLATION test0 FROM "C";
 CREATE TABLE collate_dep_test1 (a int, b text COLLATE test0);
@@ -1048,13 +1077,13 @@ drop cascades to composite type collate_dep_test2 column y
 drop cascades to view collate_dep_test3
 drop cascades to index collate_dep_test4i
 \d collate_dep_test1
-         Table "public.collate_dep_test1"
+      Table "collate_tests.collate_dep_test1"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  a      | integer |           |          | 
 
 \d collate_dep_test2
-     Composite type "public.collate_dep_test2"
+ Composite type "collate_tests.collate_dep_test2"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  x      | integer |           |          | 
@@ -1063,7 +1092,7 @@ DROP TABLE collate_dep_test1, collate_dep_test4t;
 DROP TYPE collate_dep_test2;
 -- test range types and collations
 create type textrange_c as range(subtype=text, collation="C");
-create type textrange_en_us as range(subtype=text, collation="en_US");
+create type textrange_en_us as range(subtype=text, collation="en%icu");
 select textrange_c('A','Z') @> 'b'::text;
  ?column? 
 ----------
@@ -1078,3 +1107,24 @@ select textrange_en_us('A','Z') @> 'b'::text;
 
 drop type textrange_c;
 drop type textrange_en_us;
+-- cleanup
+DROP SCHEMA collate_tests CASCADE;
+NOTICE:  drop cascades to 18 other objects
+DETAIL:  drop cascades to table collate_test1
+drop cascades to table collate_test_like
+drop cascades to table collate_test2
+drop cascades to table collate_test3
+drop cascades to type testdomain_sv
+drop cascades to table collate_test4
+drop cascades to table collate_test5
+drop cascades to table collate_test10
+drop cascades to table collate_test6
+drop cascades to view collview1
+drop cascades to view collview2
+drop cascades to view collview3
+drop cascades to type testdomain
+drop cascades to function mylt(text,text)
+drop cascades to function mylt_noninline(text,text)
+drop cascades to function mylt_plpgsql(text,text)
+drop cascades to function mylt2(text,text)
+drop cascades to function dup(anyelement)
diff --git a/src/test/regress/expected/collate.linux.utf8.out b/src/test/regress/expected/collate.linux.utf8.out
index 293e78641e..332fb4dbbd 100644
--- a/src/test/regress/expected/collate.linux.utf8.out
+++ b/src/test/regress/expected/collate.linux.utf8.out
@@ -4,12 +4,14 @@
  * because other encodings don't support all the characters used.
  */
 SET client_encoding TO UTF8;
+CREATE SCHEMA collate_tests;
+SET search_path = collate_tests;
 CREATE TABLE collate_test1 (
     a int,
     b text COLLATE "en_US" NOT NULL
 );
 \d collate_test1
-           Table "public.collate_test1"
+        Table "collate_tests.collate_test1"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  a      | integer |           |          | 
@@ -40,7 +42,7 @@ CREATE TABLE collate_test_like (
     LIKE collate_test1
 );
 \d collate_test_like
-         Table "public.collate_test_like"
+      Table "collate_tests.collate_test_like"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  a      | integer |           |          | 
@@ -364,6 +366,38 @@ SELECT * FROM collate_test1 WHERE b ~* 'bc';
  4 | ABC
 (4 rows)
 
+CREATE TABLE collate_test6 (
+    a int,
+    b text COLLATE "en_US"
+);
+INSERT INTO collate_test6 VALUES (1, 'abc'), (2, 'ABC'), (3, '123'), (4, 'ab1'),
+                                 (5, 'a1!'), (6, 'a c'), (7, '!.;'), (8, '   '),
+                                 (9, 'äbç'), (10, 'ÄBÇ');
+SELECT b,
+       b ~ '^[[:alpha:]]+$' AS is_alpha,
+       b ~ '^[[:upper:]]+$' AS is_upper,
+       b ~ '^[[:lower:]]+$' AS is_lower,
+       b ~ '^[[:digit:]]+$' AS is_digit,
+       b ~ '^[[:alnum:]]+$' AS is_alnum,
+       b ~ '^[[:graph:]]+$' AS is_graph,
+       b ~ '^[[:print:]]+$' AS is_print,
+       b ~ '^[[:punct:]]+$' AS is_punct,
+       b ~ '^[[:space:]]+$' AS is_space
+FROM collate_test6;
+  b  | is_alpha | is_upper | is_lower | is_digit | is_alnum | is_graph | is_print | is_punct | is_space 
+-----+----------+----------+----------+----------+----------+----------+----------+----------+----------
+ abc | t        | f        | t        | f        | t        | t        | t        | f        | f
+ ABC | t        | t        | f        | f        | t        | t        | t        | f        | f
+ 123 | f        | f        | f        | t        | t        | t        | t        | f        | f
+ ab1 | f        | f        | f        | f        | t        | t        | t        | f        | f
+ a1! | f        | f        | f        | f        | f        | t        | t        | f        | f
+ a c | f        | f        | f        | f        | f        | f        | t        | f        | f
+ !.; | f        | f        | f        | f        | f        | t        | t        | t        | f
+     | f        | f        | f        | f        | f        | f        | t        | f        | t
+ äbç | t        | f        | t        | f        | t        | t        | t        | f        | f
+ ÄBÇ | t        | t        | f        | f        | t        | t        | t        | f        | f
+(10 rows)
+
 SELECT 'Türkiye' COLLATE "en_US" ~* 'KI' AS "true";
  true 
 ------
@@ -993,7 +1027,7 @@ SELECT collname FROM pg_collation WHERE collname LIKE 'test%' ORDER BY 1;
 
 ALTER COLLATION test1 RENAME TO test11;
 ALTER COLLATION test0 RENAME TO test11; -- fail
-ERROR:  collation "test11" for encoding "UTF8" already exists in schema "public"
+ERROR:  collation "test11" for encoding "UTF8" already exists in schema "collate_tests"
 ALTER COLLATION test1 RENAME TO test22; -- fail
 ERROR:  collation "test1" for encoding "UTF8" does not exist
 ALTER COLLATION test11 OWNER TO regress_test_role;
@@ -1005,11 +1039,11 @@ SELECT collname, nspname, obj_description(pg_collation.oid, 'pg_collation')
     FROM pg_collation JOIN pg_namespace ON (collnamespace = pg_namespace.oid)
     WHERE collname LIKE 'test%'
     ORDER BY 1;
- collname |   nspname   | obj_description 
-----------+-------------+-----------------
- test0    | public      | US English
- test11   | test_schema | 
- test5    | public      | 
+ collname |    nspname    | obj_description 
+----------+---------------+-----------------
+ test0    | collate_tests | US English
+ test11   | test_schema   | 
+ test5    | collate_tests | 
 (3 rows)
 
 DROP COLLATION test0, test_schema.test11, test5;
@@ -1024,6 +1058,9 @@ SELECT collname FROM pg_collation WHERE collname LIKE 'test%';
 
 DROP SCHEMA test_schema;
 DROP ROLE regress_test_role;
+-- ALTER
+ALTER COLLATION "en_US" REFRESH VERSION;
+NOTICE:  version has not changed
 -- dependencies
 CREATE COLLATION test0 FROM "C";
 CREATE TABLE collate_dep_test1 (a int, b text COLLATE test0);
@@ -1048,13 +1085,13 @@ drop cascades to composite type collate_dep_test2 column y
 drop cascades to view collate_dep_test3
 drop cascades to index collate_dep_test4i
 \d collate_dep_test1
-         Table "public.collate_dep_test1"
+      Table "collate_tests.collate_dep_test1"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  a      | integer |           |          | 
 
 \d collate_dep_test2
-     Composite type "public.collate_dep_test2"
+ Composite type "collate_tests.collate_dep_test2"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  x      | integer |           |          | 
@@ -1078,3 +1115,24 @@ select textrange_en_us('A','Z') @> 'b'::text;
 
 drop type textrange_c;
 drop type textrange_en_us;
+-- cleanup
+DROP SCHEMA collate_tests CASCADE;
+NOTICE:  drop cascades to 18 other objects
+DETAIL:  drop cascades to table collate_test1
+drop cascades to table collate_test_like
+drop cascades to table collate_test2
+drop cascades to table collate_test3
+drop cascades to type testdomain_sv
+drop cascades to table collate_test4
+drop cascades to table collate_test5
+drop cascades to table collate_test10
+drop cascades to table collate_test6
+drop cascades to view collview1
+drop cascades to view collview2
+drop cascades to view collview3
+drop cascades to type testdomain
+drop cascades to function mylt(text,text)
+drop cascades to function mylt_noninline(text,text)
+drop cascades to function mylt_plpgsql(text,text)
+drop cascades to function mylt2(text,text)
+drop cascades to function dup(anyelement)
diff --git a/src/test/regress/sql/collate.linux.utf8.sql b/src/test/regress/sql/collate.icu.sql
similarity index 82%
copy from src/test/regress/sql/collate.linux.utf8.sql
copy to src/test/regress/sql/collate.icu.sql
index c349cbde2b..cf092730ec 100644
--- a/src/test/regress/sql/collate.linux.utf8.sql
+++ b/src/test/regress/sql/collate.icu.sql
@@ -1,31 +1,32 @@
 /*
- * This test is for Linux/glibc systems and assumes that a full set of
- * locales is installed.  It must be run in a database with UTF-8 encoding,
- * because other encodings don't support all the characters used.
+ * This test is for ICU collations.
  */
 
 SET client_encoding TO UTF8;
 
+CREATE SCHEMA collate_tests;
+SET search_path = collate_tests;
+
 
 CREATE TABLE collate_test1 (
     a int,
-    b text COLLATE "en_US" NOT NULL
+    b text COLLATE "en%icu" NOT NULL
 );
 
 \d collate_test1
 
 CREATE TABLE collate_test_fail (
     a int,
-    b text COLLATE "ja_JP.eucjp"
+    b text COLLATE "ja_JP.eucjp%icu"
 );
 
 CREATE TABLE collate_test_fail (
     a int,
-    b text COLLATE "foo"
+    b text COLLATE "foo%icu"
 );
 
 CREATE TABLE collate_test_fail (
-    a int COLLATE "en_US",
+    a int COLLATE "en%icu",
     b text
 );
 
@@ -37,7 +38,7 @@ CREATE TABLE collate_test_like (
 
 CREATE TABLE collate_test2 (
     a int,
-    b text COLLATE "sv_SE"
+    b text COLLATE "sv%icu"
 );
 
 CREATE TABLE collate_test3 (
@@ -57,11 +58,11 @@ CREATE TABLE collate_test3 (
 SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc';
 SELECT * FROM collate_test1 WHERE b >= 'bbc' COLLATE "C";
 SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "C";
-SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "en_US";
+SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "en%icu";
 
 
-CREATE DOMAIN testdomain_sv AS text COLLATE "sv_SE";
-CREATE DOMAIN testdomain_i AS int COLLATE "sv_SE"; -- fails
+CREATE DOMAIN testdomain_sv AS text COLLATE "sv%icu";
+CREATE DOMAIN testdomain_i AS int COLLATE "sv%icu"; -- fails
 CREATE TABLE collate_test4 (
     a int,
     b testdomain_sv
@@ -71,7 +72,7 @@ CREATE TABLE collate_test4 (
 
 CREATE TABLE collate_test5 (
     a int,
-    b testdomain_sv COLLATE "en_US"
+    b testdomain_sv COLLATE "en%icu"
 );
 INSERT INTO collate_test5 SELECT * FROM collate_test1;
 SELECT a, b FROM collate_test5 ORDER BY b;
@@ -89,15 +90,15 @@ CREATE TABLE collate_test5 (
 SELECT * FROM collate_test3 ORDER BY b;
 
 -- constant expression folding
-SELECT 'bbc' COLLATE "en_US" > 'äbc' COLLATE "en_US" AS "true";
-SELECT 'bbc' COLLATE "sv_SE" > 'äbc' COLLATE "sv_SE" AS "false";
+SELECT 'bbc' COLLATE "en%icu" > 'äbc' COLLATE "en%icu" AS "true";
+SELECT 'bbc' COLLATE "sv%icu" > 'äbc' COLLATE "sv%icu" AS "false";
 
 -- upper/lower
 
 CREATE TABLE collate_test10 (
     a int,
-    x text COLLATE "en_US",
-    y text COLLATE "tr_TR"
+    x text COLLATE "en%icu",
+    y text COLLATE "tr%icu"
 );
 
 INSERT INTO collate_test10 VALUES (1, 'hij', 'hij'), (2, 'HIJ', 'HIJ');
@@ -116,11 +117,11 @@ CREATE TABLE collate_test10 (
 SELECT * FROM collate_test1 WHERE b ILIKE 'abc%';
 SELECT * FROM collate_test1 WHERE b ILIKE '%bc%';
 
-SELECT 'Türkiye' COLLATE "en_US" ILIKE '%KI%' AS "true";
-SELECT 'Türkiye' COLLATE "tr_TR" ILIKE '%KI%' AS "false";
+SELECT 'Türkiye' COLLATE "en%icu" ILIKE '%KI%' AS "true";
+SELECT 'Türkiye' COLLATE "tr%icu" ILIKE '%KI%' AS "false";
 
-SELECT 'bıt' ILIKE 'BIT' COLLATE "en_US" AS "false";
-SELECT 'bıt' ILIKE 'BIT' COLLATE "tr_TR" AS "true";
+SELECT 'bıt' ILIKE 'BIT' COLLATE "en%icu" AS "false";
+SELECT 'bıt' ILIKE 'BIT' COLLATE "tr%icu" AS "true";
 
 -- The following actually exercises the selectivity estimation for ILIKE.
 SELECT relname FROM pg_class WHERE relname ILIKE 'abc%';
@@ -134,11 +135,30 @@ CREATE TABLE collate_test10 (
 SELECT * FROM collate_test1 WHERE b ~* '^abc';
 SELECT * FROM collate_test1 WHERE b ~* 'bc';
 
-SELECT 'Türkiye' COLLATE "en_US" ~* 'KI' AS "true";
-SELECT 'Türkiye' COLLATE "tr_TR" ~* 'KI' AS "false";
-
-SELECT 'bıt' ~* 'BIT' COLLATE "en_US" AS "false";
-SELECT 'bıt' ~* 'BIT' COLLATE "tr_TR" AS "true";
+CREATE TABLE collate_test6 (
+    a int,
+    b text COLLATE "en%icu"
+);
+INSERT INTO collate_test6 VALUES (1, 'abc'), (2, 'ABC'), (3, '123'), (4, 'ab1'),
+                                 (5, 'a1!'), (6, 'a c'), (7, '!.;'), (8, '   '),
+                                 (9, 'äbç'), (10, 'ÄBÇ');
+SELECT b,
+       b ~ '^[[:alpha:]]+$' AS is_alpha,
+       b ~ '^[[:upper:]]+$' AS is_upper,
+       b ~ '^[[:lower:]]+$' AS is_lower,
+       b ~ '^[[:digit:]]+$' AS is_digit,
+       b ~ '^[[:alnum:]]+$' AS is_alnum,
+       b ~ '^[[:graph:]]+$' AS is_graph,
+       b ~ '^[[:print:]]+$' AS is_print,
+       b ~ '^[[:punct:]]+$' AS is_punct,
+       b ~ '^[[:space:]]+$' AS is_space
+FROM collate_test6;
+
+SELECT 'Türkiye' COLLATE "en%icu" ~* 'KI' AS "true";
+SELECT 'Türkiye' COLLATE "tr%icu" ~* 'KI' AS "true";  -- true with ICU
+
+SELECT 'bıt' ~* 'BIT' COLLATE "en%icu" AS "false";
+SELECT 'bıt' ~* 'BIT' COLLATE "tr%icu" AS "false";  -- false with ICU
 
 -- The following actually exercises the selectivity estimation for ~*.
 SELECT relname FROM pg_class WHERE relname ~* '^abc';
@@ -148,7 +168,7 @@ CREATE TABLE collate_test10 (
 
 SET lc_time TO 'tr_TR';
 SELECT to_char(date '2010-04-01', 'DD TMMON YYYY');
-SELECT to_char(date '2010-04-01', 'DD TMMON YYYY' COLLATE "tr_TR");
+SELECT to_char(date '2010-04-01', 'DD TMMON YYYY' COLLATE "tr%icu");
 
 
 -- backwards parsing
@@ -218,9 +238,9 @@ CREATE TABLE test_u AS SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM
 
 -- collation mismatch between recursive and non-recursive term
 WITH RECURSIVE foo(x) AS
-   (SELECT x FROM (VALUES('a' COLLATE "en_US"),('b')) t(x)
+   (SELECT x FROM (VALUES('a' COLLATE "en%icu"),('b')) t(x)
    UNION ALL
-   SELECT (x || 'c') COLLATE "de_DE" FROM foo WHERE length(x) < 10)
+   SELECT (x || 'c') COLLATE "de%icu" FROM foo WHERE length(x) < 10)
 SELECT * FROM foo;
 
 
@@ -268,7 +288,7 @@ CREATE FUNCTION mylt2 (x text, y text) RETURNS boolean LANGUAGE plpgsql AS $$
 end
 $$;
 
-SELECT mylt2('a', 'B' collate "en_US") as t, mylt2('a', 'B' collate "C") as f;
+SELECT mylt2('a', 'B' collate "en%icu") as t, mylt2('a', 'B' collate "C") as f;
 
 CREATE OR REPLACE FUNCTION
   mylt2 (x text, y text) RETURNS boolean LANGUAGE plpgsql AS $$
@@ -320,23 +340,21 @@ CREATE SCHEMA test_schema;
 -- We need to do this this way to cope with varying names for encodings:
 do $$
 BEGIN
-  EXECUTE 'CREATE COLLATION test0 (locale = ' ||
+  EXECUTE 'CREATE COLLATION test0 (provider = icu, locale = ' ||
           quote_literal(current_setting('lc_collate')) || ');';
 END
 $$;
 CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
-CREATE COLLATION IF NOT EXISTS test0 FROM "C"; -- ok, skipped
-CREATE COLLATION IF NOT EXISTS test0 (locale = 'foo'); -- ok, skipped
 do $$
 BEGIN
-  EXECUTE 'CREATE COLLATION test1 (lc_collate = ' ||
+  EXECUTE 'CREATE COLLATION test1 (provider = icu, lc_collate = ' ||
           quote_literal(current_setting('lc_collate')) ||
           ', lc_ctype = ' ||
           quote_literal(current_setting('lc_ctype')) || ');';
 END
 $$;
-CREATE COLLATION test3 (lc_collate = 'en_US.utf8'); -- fail, need lc_ctype
-CREATE COLLATION testx (locale = 'nonsense'); -- fail
+CREATE COLLATION test3 (provider = icu, lc_collate = 'en_US.utf8'); -- fail, need lc_ctype
+CREATE COLLATION testx (provider = icu, locale = 'nonsense'); /* never fails with ICU */  DROP COLLATION testx;
 
 CREATE COLLATION test4 FROM nonsense;
 CREATE COLLATION test5 FROM test0;
@@ -368,6 +386,11 @@ CREATE COLLATION test5 FROM test0;
 DROP ROLE regress_test_role;
 
 
+-- ALTER
+
+ALTER COLLATION "en%icu" REFRESH VERSION;
+
+
 -- dependencies
 
 CREATE COLLATION test0 FROM "C";
@@ -391,10 +414,14 @@ CREATE INDEX collate_dep_test4i ON collate_dep_test4t (b COLLATE test0);
 -- test range types and collations
 
 create type textrange_c as range(subtype=text, collation="C");
-create type textrange_en_us as range(subtype=text, collation="en_US");
+create type textrange_en_us as range(subtype=text, collation="en%icu");
 
 select textrange_c('A','Z') @> 'b'::text;
 select textrange_en_us('A','Z') @> 'b'::text;
 
 drop type textrange_c;
 drop type textrange_en_us;
+
+
+-- cleanup
+DROP SCHEMA collate_tests CASCADE;
diff --git a/src/test/regress/sql/collate.linux.utf8.sql b/src/test/regress/sql/collate.linux.utf8.sql
index c349cbde2b..78c7ac2e44 100644
--- a/src/test/regress/sql/collate.linux.utf8.sql
+++ b/src/test/regress/sql/collate.linux.utf8.sql
@@ -6,6 +6,9 @@
 
 SET client_encoding TO UTF8;
 
+CREATE SCHEMA collate_tests;
+SET search_path = collate_tests;
+
 
 CREATE TABLE collate_test1 (
     a int,
@@ -134,6 +137,25 @@ CREATE TABLE collate_test10 (
 SELECT * FROM collate_test1 WHERE b ~* '^abc';
 SELECT * FROM collate_test1 WHERE b ~* 'bc';
 
+CREATE TABLE collate_test6 (
+    a int,
+    b text COLLATE "en_US"
+);
+INSERT INTO collate_test6 VALUES (1, 'abc'), (2, 'ABC'), (3, '123'), (4, 'ab1'),
+                                 (5, 'a1!'), (6, 'a c'), (7, '!.;'), (8, '   '),
+                                 (9, 'äbç'), (10, 'ÄBÇ');
+SELECT b,
+       b ~ '^[[:alpha:]]+$' AS is_alpha,
+       b ~ '^[[:upper:]]+$' AS is_upper,
+       b ~ '^[[:lower:]]+$' AS is_lower,
+       b ~ '^[[:digit:]]+$' AS is_digit,
+       b ~ '^[[:alnum:]]+$' AS is_alnum,
+       b ~ '^[[:graph:]]+$' AS is_graph,
+       b ~ '^[[:print:]]+$' AS is_print,
+       b ~ '^[[:punct:]]+$' AS is_punct,
+       b ~ '^[[:space:]]+$' AS is_space
+FROM collate_test6;
+
 SELECT 'Türkiye' COLLATE "en_US" ~* 'KI' AS "true";
 SELECT 'Türkiye' COLLATE "tr_TR" ~* 'KI' AS "false";
 
@@ -368,6 +390,11 @@ CREATE COLLATION test5 FROM test0;
 DROP ROLE regress_test_role;
 
 
+-- ALTER
+
+ALTER COLLATION "en_US" REFRESH VERSION;
+
+
 -- dependencies
 
 CREATE COLLATION test0 FROM "C";
@@ -398,3 +425,7 @@ CREATE INDEX collate_dep_test4i ON collate_dep_test4t (b COLLATE test0);
 
 drop type textrange_c;
 drop type textrange_en_us;
+
+
+-- cleanup
+DROP SCHEMA collate_tests CASCADE;
-- 
2.11.1

#61Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Thomas Munro (#53)
Re: ICU integration

On 1/31/17 16:59, Thomas Munro wrote:

I assume (but haven't checked) that ucol_nextSortKeyPart accesses only
the start of the string via the UCharIterator passed in, unless you
have the rare reverse-accent-sort feature enabled (maybe used only in
fr_CA, it looks like it is required to scan the whole string looking
for the last accent). So I assume that uiter_setUTF8 and
ucol_nextSortKeyPart would usually do a small fixed amount of work,
whereas this patch's icu_to_uchar allocates space and converts the
whole variable length string every time.

I have changed it to use ucol_nextSortKeyPart() when appropriate.

That's about abbreviation, but I note that you can also compare
strings using iterators with ucol_strcollIter, avoiding the need to
allocate and transcode up front. I have no idea whether that'd pay
off.

These iterators don't actually transcode on the fly. You can only set
up iterators for UTF-8 or UTF-16 strings. And for UTF-8, we already
have a special code path using ucol_strcollUTF8(), which uses an
iterator internally.
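
For reference, the two calling conventions look roughly like this (just a
sketch, not code from the patch; "coll" is assumed to be an already-opened
UCollator, and error handling is elided):

#include <unicode/ucol.h>
#include <unicode/uiter.h>

/* Compare two UTF-8 strings directly; ICU accepts the UTF-8 input as is. */
static int
compare_utf8_direct(UCollator *coll, const char *a, int alen,
                    const char *b, int blen)
{
    UErrorCode  status = U_ZERO_ERROR;

    return ucol_strcollUTF8(coll, a, alen, b, blen, &status);
}

/* The same comparison, but via UCharIterators over the UTF-8 input. */
static int
compare_utf8_iter(UCollator *coll, const char *a, int alen,
                  const char *b, int blen)
{
    UErrorCode  status = U_ZERO_ERROR;
    UCharIterator ia;
    UCharIterator ib;

    uiter_setUTF8(&ia, a, alen);
    uiter_setUTF8(&ib, b, blen);
    return ucol_strcollIter(coll, &ia, &ib, &status);
}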

Clearly when you upgrade your system from (say) Debian 8 to Debian 9
and the ICU major version changes we expect to have to REINDEX, but
does anyone know if such data changes are ever pulled into the minor
version package upgrades you get from regular apt-get update of (say)
a Debian 8 or CentOS 7 or FreeBSD 11 system? In other words, do we
expect collversion changes to happen basically any time in response to
regular system updates, or only when you're doing a major upgrade of
some kind, as the above-quoted documentation patch implies?

Stable operating systems shouldn't do major library upgrades, so I feel
pretty confident about this.

+ collversion = ntohl(*((uint32 *) versioninfo));

UVersionInfo is an array of four uint8_t. That doesn't sound like
something that needs to be bit-swizzled... is it really? Even if it
is arranged differently on different architectures, I'm not sure why
you care since we only ever use it to compare for equality on the same
system. But aside from that, I don't love this cast to uint32. I
wonder if we should use u_versionToString to respect boundaries a bit
better?

Makes sense, changed the column type to text.
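
That is, the stored value can simply be what u_versionToString() produces,
roughly along these lines (illustration only, not the exact code from the
patch):

#include <unicode/ucol.h>
#include <unicode/uversion.h>

/*
 * Render a collator's version as a dotted string, suitable for storing in
 * a text column.  buf must have room for U_MAX_VERSION_STRING_LENGTH bytes.
 */
static void
get_collation_version_string(const UCollator *coll, char *buf)
{
    UVersionInfo versioninfo;   /* four uint8_t components */

    ucol_getVersion(coll, versioninfo);
    u_versionToString(versioninfo, buf);
}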

I have another motivation for wanting to model collation versions as
strings: I have been contemplating a version check for system-provided
collations too, piggy-backing on your work here. Obviously PostgreSQL
can't directly know anything about system collation versions, but I'm
thinking of a GUC system_collation_version_command which you could
optionally set to a script that knows how to inspect the local
operating system. For example, a package maintainer might be
interested in writing such a script that knows how to ask the package
system for a locale data version. A brute force approach that works
on many systems could be 'tar c /usr/share/local/*/LC_COLLATE | md5'.

That sounds like an idea worth pursuing.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#62Peter Geoghegan
In reply to: Peter Eisentraut (#61)
Re: ICU integration

On Wed, Feb 15, 2017 at 9:17 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

I have changed it to use ucol_nextSortKeyPart() when appropriate.

I think it would be fine if the second last argument was
"sizeof(Datum)", not "Min(sizeof(Datum), sss->buflen2)" here:

+               if (GetDatabaseEncoding() == PG_UTF8)
+               {
+                   UCharIterator iter;
+                   uint32_t    state[2];
+                   UErrorCode  status;
+
+                   uiter_setUTF8(&iter, sss->buf1, len);
+                   state[0] = state[1] = 0;  /* won't need that again */
+                   status = U_ZERO_ERROR;
+                   bsize = ucol_nextSortKeyPart(sss->locale->info.icu.ucol,
+                                                &iter,
+                                                state,
+                                                (uint8_t *) sss->buf2,
+                                                Min(sizeof(Datum), sss->buflen2),
+                                                &status);
+                   if (U_FAILURE(status))
+                       ereport(ERROR,
+                               (errmsg("sort key generation failed: %s", u_errorName(status))));
+               }

Does this really need to be in the strxfrm() loop at all? I don't see
why you need to ever retry here.

There is an issue of style here:

-       /* Just like strcoll(), strxfrm() expects a NUL-terminated string */
memcpy(sss->buf1, authoritative_data, len);
+       /* Just like strcoll(), strxfrm() expects a NUL-terminated string.
+        * Not necessary for ICU, but doesn't hurt. */

I see that you have preserved strcoll() comparison caching (or should
I say ucol_strcollUTF8() comparison caching?), at the cost of having
to keep around a buffer which we must continue to copy every text
string into within varstr_abbrev_convert(). That was probably the
right trade-off. It doesn't hurt that it required minimal divergence
within new ICU code, too.

--
Peter Geoghegan

#63Peter Geoghegan
In reply to: Peter Eisentraut (#61)
Re: ICU integration

On Wed, Feb 15, 2017 at 9:17 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

Clearly when you upgrade your system from (say) Debian 8 to Debian 9
and the ICU major version changes we expect to have to REINDEX, but
does anyone know if such data changes are ever pulled into the minor
version package upgrades you get from regular apt-get update of (say)
a Debian 8 or CentOS 7 or FreeBSD 11 system? In other words, do we
expect collversion changes to happen basically any time in response to
regular system updates, or only when you're doing a major upgrade of
some kind, as the above-quoted documentation patch implies?

Stable operating systems shouldn't do major library upgrades, so I feel
pretty confident about this.

There is one case where it *is* appropriate for a bugfix release of
ICU to bump collator version: When there was a bug in the collator
itself, leading to broken binary blobs [1]. We should make the user
REINDEX when this happens.

I think that ICU may well do this in point releases, which is actually
a good thing. So, it seems like a good idea to test that there is no
change, in a place like check_strxfrm_bug(). In my opinion, we should
LOG a complaint in the event of a mismatch rather than refusing to
start the server, since there probably isn't that much chance of the
user being harmed by the bug. The cost of not starting the server
properly until a REINDEX finishes or similar looks particularly
unappealing when one considers that the app was probably affected by
any corruption for weeks or months before the ICU update enabled its
detection.
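
To sketch the shape of what I have in mind (purely illustrative -- the
function name, the "recorded_version" argument standing in for the stored
collversion value, and the message wording are all made up):

#include "postgres.h"

#include <unicode/ucol.h>
#include <unicode/uversion.h>

/*
 * Hypothetical check: compare the version recorded for a collation when it
 * was created against what the loaded ICU collator reports now, and
 * complain in the log instead of refusing to use the collation.
 */
static void
check_collation_version(const char *collname, const char *recorded_version,
                        const UCollator *coll)
{
    UVersionInfo versioninfo;
    char        current_version[U_MAX_VERSION_STRING_LENGTH];

    ucol_getVersion(coll, versioninfo);
    u_versionToString(versioninfo, current_version);

    if (strcmp(recorded_version, current_version) != 0)
        ereport(LOG,
                (errmsg("collation \"%s\" was created with version %s, but ICU now reports %s",
                        collname, recorded_version, current_version),
                 errhint("Rebuild affected indexes, then run ALTER COLLATION ... REFRESH VERSION.")));
}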

[1]: http://userguide.icu-project.org/collation/api#TOC-Sort-Key-Features
--
Peter Geoghegan

#64Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Peter Eisentraut (#60)
Re: ICU integration

Peter Eisentraut wrote:

- Naming of collations: Are we happy with the "de%icu" naming? I might
have come up with that while reviewing the IPv6 zone index patch. ;-)
An alternative might be "de$icu" for more Oracle vibe and avoiding the
need for double quotes in some cases. (But we have mixed-case names
like "de_AT%icu", so further changes would be necessary to fully get rid
of the need for quoting.) A more radical alternative would be to
install ICU locales in a different schema and use the search_path, or
even have a separate search path setting for collations only. Which
leads to ...

- Selecting default collation provider: Maybe we want a setting, say in
initdb, to determine which provider's collations get the "good" names?
Maybe not necessary for this release, but something to think about.

I'm not sure I like "default collations" to depend on either search_path
or some hypothetical future setting. That seems to lead to unproductive
discussions on mailing list questions like this:

User: hey, my sort doesn't work sanely
Pg person: okay, but what collation are you using? Look for the collate
clause
User: Ah, it's de_DE
Pg person: oh, okay, but is it ICU's de_DE or the libc's?
User: ...

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#65Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Peter Geoghegan (#62)
Re: ICU integration

On 2/16/17 01:13, Peter Geoghegan wrote:

On Wed, Feb 15, 2017 at 9:17 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

I have changed it to use ucol_nextSortKeyPart() when appropriate.

I think it would be fine if the second last argument was
"sizeof(Datum)", not "Min(sizeof(Datum), sss->buflen2)" here:

I don't see anything that guarantees that the buffer length is always at
least sizeof(Datum).

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#66Bruce Momjian
bruce@momjian.us
In reply to: Peter Geoghegan (#63)
Re: ICU integration

On Wed, Feb 15, 2017 at 10:35:34PM -0800, Peter Geoghegan wrote:

On Wed, Feb 15, 2017 at 9:17 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

Stable operating systems shouldn't do major library upgrades, so I feel
pretty confident about this.

There is one case where it *is* appropriate for a bugfix release of
ICU to bump collator version: When there was a bug in the collator
itself, leading to broken binary blobs [1]. We should make the user
REINDEX when this happens.

I think that ICU may well do this in point releases, which is actually
a good thing. So, it seems like a good idea to test that there is no
change, in a place like check_strxfrm_bug(). In my opinion, we should
LOG a complaint in the event of a mismatch rather than refusing to
start the server, since there probably isn't that much chance of the
user being harmed by the bug. The cost of not starting the server
properly until a REINDEX finishes or similar looks particularly
unappealing when one considers that the app was probably affected by
any corruption for weeks or months before the ICU update enabled its
detection.

I can't think of any cases where we warn of possible corruption only in
the server logs.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +

#67Peter Geoghegan
In reply to: Bruce Momjian (#66)
Re: ICU integration

On Mon, Feb 20, 2017 at 2:38 PM, Bruce Momjian <bruce@momjian.us> wrote:

I can't think of any cases where we warn of possible corruption only in
the server logs.

It's just like any case where there was a bug in Postgres that could
have subtly broken index builds. We don't make the next point release
force users to REINDEX every possibly-affected index every time that
happens. Presumably this is because they aren't particularly likely to
be affected by any given bug, and because it's pretty infeasible for
us to do so anyway. There aren't always easy ways to detect the
problem, and users will never learn to deal with issues like this well
when it is by definition something that should never happen.

--
Peter Geoghegan

#68Bruce Momjian
bruce@momjian.us
In reply to: Peter Geoghegan (#67)
Re: ICU integration

On Mon, Feb 20, 2017 at 02:54:22PM -0800, Peter Geoghegan wrote:

On Mon, Feb 20, 2017 at 2:38 PM, Bruce Momjian <bruce@momjian.us> wrote:

I can't think of any cases where we warn of possible corruption only in
the server logs.

It's just like any case where there was a bug in Postgres that could
have subtly broken index builds. We don't make the next point release
force users to REINDEX every possibly-affected index every time that
happens. Presumably this is because they aren't particularly likely to
be affected by any given bug, and because it's pretty infeasible for
us to do so anyway. There aren't always easy ways to detect the
problem, and users will never learn to deal with issues like this well
when it is by definition something that should never happen.

Well, the release notes are a clear-enough communication for a need to
reindex. I don't see a LOG message as similar. Don't we have other
cases where we have to warn administrators? We can mark the indexes as
invalid but how would we report that?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +

#69Peter Geoghegan
In reply to: Bruce Momjian (#68)
Re: ICU integration

On Mon, Feb 20, 2017 at 3:19 PM, Bruce Momjian <bruce@momjian.us> wrote:

Well, the release notes are a clear-enough communication for a need to
reindex. I don't see a LOG message as similar. Don't we have other
cases where we have to warn administrators? We can mark the indexes as
invalid but how would we report that?

Marking all indexes as invalid would be an enormous overkill. We don't
even know for sure that the values the user has indexed happens to be
affected. In order for there to have been a bug in ICU in the first
place, the likelihood is that it only occurs in what are edge cases
for the collator.

ICU is a very popular library, used in software that I personally
interact with every day [1]. Any bugs like this should be exceptional.
They very clearly appreciate how sensitive software like Postgres is
to changes like this, which is why the versioning API exists.

[1]: http://site.icu-project.org/#TOC-Who-Uses-ICU-
--
Peter Geoghegan

#70Bruce Momjian
bruce@momjian.us
In reply to: Peter Geoghegan (#69)
Re: ICU integration

On Mon, Feb 20, 2017 at 03:29:07PM -0800, Peter Geoghegan wrote:

Marking all indexes as invalid would be an enormous overkill. We don't
even know for sure that the values the user has indexed happens to be
affected. In order for there to have been a bug in ICU in the first
place, the likelihood is that it only occurs in what are edge cases
for the collator.

ICU is a very popular library, used in software that I personally
interact with every day [1]. Any bugs like this should be exceptional.
They very clearly appreciate how sensitive software like Postgres is
to changes like this, which is why the versioning API exists.

[1] http://site.icu-project.org/#TOC-Who-Uses-ICU-

So we don't have any other cases where we warn about possible corruption
except this?

Also, I will go back to my previous concern, that while I like the fact
we can detect collation changes with ICU, we don't know if ICU
collations change more often than OS collations.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +

#71Peter Geoghegan
In reply to: Bruce Momjian (#70)
Re: ICU integration

On Mon, Feb 20, 2017 at 3:51 PM, Bruce Momjian <bruce@momjian.us> wrote:

So we don't have any other cases where we warn about possible corruption
except this?

I'm not sure that I understand the distinction you're making.

Also, I will go back to my previous concern, that while I like the fact
we can detect collation changes with ICU, we don't know if ICU
collations change more often than OS collations.

We do know that ICU collations can never change behaviorally in a
given stable release. Bug fixes are allowed in point releases, but
these never change the user-visible behavior of collations. That's
very clear, because an upstream Unicode UCA version is used by a given
major release of ICU. This upstream data describes the behavior of a
collation using a high-level declarative language that non-programmer
experts in natural languages write.

ICU versions many different things, in fact. Importantly, it
explicitly decouples behavioral issues (user visible sort order -- UCA
version) from technical issues (collator implementation details). So,
my original point is that that could change, and if that happens we
ought to have a plan. But, it won't change unless it really has to.
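
The API makes that decoupling visible directly: a collator reports both an
implementation version and a UCA version, and the two can move
independently. A minimal sketch (error handling mostly elided):

#include <stdio.h>
#include <unicode/ucol.h>
#include <unicode/uversion.h>

/* Print a collator's implementation version alongside its UCA version. */
static void
print_collator_versions(const char *locale)
{
    UErrorCode  status = U_ZERO_ERROR;
    UCollator  *coll = ucol_open(locale, &status);
    UVersionInfo impl_version;
    UVersionInfo uca_version;
    char        impl_str[U_MAX_VERSION_STRING_LENGTH];
    char        uca_str[U_MAX_VERSION_STRING_LENGTH];

    if (U_FAILURE(status))
        return;

    ucol_getVersion(coll, impl_version);    /* collator implementation */
    ucol_getUCAVersion(coll, uca_version);  /* UCA (sort order) data */
    u_versionToString(impl_version, impl_str);
    u_versionToString(uca_version, uca_str);
    printf("%s: collator %s, UCA %s\n", locale, impl_str, uca_str);

    ucol_close(coll);
}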

--
Peter Geoghegan

#72Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Peter Eisentraut (#60)
1 attachment(s)
Re: ICU integration

The v4 patch didn't apply anymore, so here is a rebased patch. No
changes in functionality.

On 2/16/17 00:10, Peter Eisentraut wrote:

Updated and rebased patch.

Significant changes:

- Changed collversion to type text

- Changed pg_locale_t to a union

- Use ucol_getAvailable() instead of uloc_getAvailable(), so the set of
initial collations is smaller now, because redundancies are eliminated.

- Added keyword variants to predefined ICU collations (so you get
"de_phonebook%icu", for example) (So the initial set of collations is
bigger now. :) )

- Predefined ICU collations have a comment now, so \dOS+ is useful.

- Use ucol_nextSortKeyPart() for abbreviated keys

- Enhanced tests and documentation

I believe all issues raised in reviews have been addressed.

Discussion points:

- Naming of collations: Are we happy with the "de%icu" naming? I might
have come up with that while reviewing the IPv6 zone index patch. ;-)
An alternative might be "de$icu" for more Oracle vibe and avoiding the
need for double quotes in some cases. (But we have mixed-case names
like "de_AT%icu", so further changes would be necessary to fully get rid
of the need for quoting.) A more radical alternative would be to
install ICU locales in a different schema and use the search_path, or
even have a separate search path setting for collations only. Which
leads to ...

- Selecting default collation provider: Maybe we want a setting, say in
initdb, to determine which provider's collations get the "good" names?
Maybe not necessary for this release, but something to think about.

- Currently (in this patch), we check a collation's version when it is
first used. But, say, after pg_upgrade, you might want to check all of
them right away. What might be a good interface for that? (Possibly,
we only have to check the ones actually in use, and we have dependency
information for that.)
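
For example, something like the query below could enumerate the candidates
from the dependency information (just a sketch; it relies on the
collversion column added by this patch):

-- Collations that other database objects actually depend on, together
-- with the version recorded when they were created; these would be the
-- ones worth re-checking after an upgrade.
SELECT DISTINCT c.collname, c.collversion
FROM pg_depend d
JOIN pg_collation c ON d.refclassid = 'pg_collation'::regclass
                   AND d.refobjid = c.oid
ORDER BY 1;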

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v5-0001-ICU-support.patch (application/x-patch)
From 00f3fb3dcd038d944e6ebe3c026fa56e242d81d3 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter_e@gmx.net>
Date: Thu, 16 Feb 2017 00:06:16 -0500
Subject: [PATCH v5] ICU support

Add a column collprovider to pg_collation that determines which library
provides the collation data.  The existing choices are default and libc,
and this adds an icu choice, which uses the ICU4C library.

The pg_locale_t type is changed to a union that contains the
provider-specific locale handles.  Users of locale information are
changed to look into that struct for the appropriate handle to use.

Also add a collversion column that records the version of the collation
when it is created, and check at run time whether it is still the same.
This detects potentially incompatible library upgrades that can corrupt
indexes and other structures.  This is currently only supported by
ICU-provided collations.

initdb initializes the default collation set as before from the `locale
-a` output but also adds all available ICU locales with a "%icu"
appended.

Currently, ICU-provided collations can only be explicitly named
collations.  The global database locales are still always libc-provided.
---
 aclocal.m4                                         |   1 +
 config/pkg.m4                                      | 275 +++++++++++++
 configure                                          | 313 ++++++++++++++
 configure.in                                       |  35 ++
 doc/src/sgml/catalogs.sgml                         |  19 +
 doc/src/sgml/charset.sgml                          | 172 +++++++-
 doc/src/sgml/installation.sgml                     |  14 +
 doc/src/sgml/ref/alter_collation.sgml              |  42 ++
 doc/src/sgml/ref/create_collation.sgml             |  16 +-
 src/Makefile.global.in                             |   4 +
 src/backend/Makefile                               |   2 +-
 src/backend/catalog/pg_collation.c                 |  26 +-
 src/backend/commands/collationcmds.c               | 205 +++++++++-
 src/backend/common.mk                              |   2 +
 src/backend/nodes/copyfuncs.c                      |  13 +
 src/backend/nodes/equalfuncs.c                     |  11 +
 src/backend/parser/gram.y                          |  18 +-
 src/backend/regex/regc_pg_locale.c                 | 110 +++--
 src/backend/tcop/utility.c                         |   8 +
 src/backend/utils/adt/formatting.c                 | 453 +++++++++++----------
 src/backend/utils/adt/like.c                       |  53 ++-
 src/backend/utils/adt/pg_locale.c                  | 259 ++++++++++--
 src/backend/utils/adt/selfuncs.c                   |   8 +-
 src/backend/utils/adt/varlena.c                    | 155 ++++++-
 src/backend/utils/mb/encnames.c                    |  76 ++++
 src/bin/initdb/initdb.c                            |   3 +-
 src/bin/pg_dump/pg_dump.c                          |  55 ++-
 src/bin/psql/describe.c                            |   7 +-
 src/include/catalog/pg_collation.h                 |  25 +-
 src/include/catalog/pg_collation_fn.h              |   1 +
 src/include/commands/collationcmds.h               |   1 +
 src/include/mb/pg_wchar.h                          |   6 +
 src/include/nodes/nodes.h                          |   1 +
 src/include/nodes/parsenodes.h                     |  11 +
 src/include/pg_config.h.in                         |   6 +
 src/include/utils/pg_locale.h                      |  32 +-
 src/test/regress/GNUmakefile                       |   3 +
 .../{collate.linux.utf8.out => collate.icu.out}    | 188 +++++----
 src/test/regress/expected/collate.linux.utf8.out   |  78 +++-
 .../{collate.linux.utf8.sql => collate.icu.sql}    |  99 +++--
 src/test/regress/sql/collate.linux.utf8.sql        |  31 ++
 41 files changed, 2344 insertions(+), 493 deletions(-)
 create mode 100644 config/pkg.m4
 copy src/test/regress/expected/{collate.linux.utf8.out => collate.icu.out} (80%)
 copy src/test/regress/sql/{collate.linux.utf8.sql => collate.icu.sql} (82%)

diff --git a/aclocal.m4 b/aclocal.m4
index 6f930b6fc1..5ca902b6a2 100644
--- a/aclocal.m4
+++ b/aclocal.m4
@@ -7,6 +7,7 @@ m4_include([config/docbook.m4])
 m4_include([config/general.m4])
 m4_include([config/libtool.m4])
 m4_include([config/perl.m4])
+m4_include([config/pkg.m4])
 m4_include([config/programs.m4])
 m4_include([config/python.m4])
 m4_include([config/tcl.m4])
diff --git a/config/pkg.m4 b/config/pkg.m4
new file mode 100644
index 0000000000..82bea96ee7
--- /dev/null
+++ b/config/pkg.m4
@@ -0,0 +1,275 @@
+dnl pkg.m4 - Macros to locate and utilise pkg-config.   -*- Autoconf -*-
+dnl serial 11 (pkg-config-0.29.1)
+dnl
+dnl Copyright © 2004 Scott James Remnant <scott@netsplit.com>.
+dnl Copyright © 2012-2015 Dan Nicholson <dbn.lists@gmail.com>
+dnl
+dnl This program is free software; you can redistribute it and/or modify
+dnl it under the terms of the GNU General Public License as published by
+dnl the Free Software Foundation; either version 2 of the License, or
+dnl (at your option) any later version.
+dnl
+dnl This program is distributed in the hope that it will be useful, but
+dnl WITHOUT ANY WARRANTY; without even the implied warranty of
+dnl MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+dnl General Public License for more details.
+dnl
+dnl You should have received a copy of the GNU General Public License
+dnl along with this program; if not, write to the Free Software
+dnl Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA
+dnl 02111-1307, USA.
+dnl
+dnl As a special exception to the GNU General Public License, if you
+dnl distribute this file as part of a program that contains a
+dnl configuration script generated by Autoconf, you may include it under
+dnl the same distribution terms that you use for the rest of that
+dnl program.
+
+dnl PKG_PREREQ(MIN-VERSION)
+dnl -----------------------
+dnl Since: 0.29
+dnl
+dnl Verify that the version of the pkg-config macros are at least
+dnl MIN-VERSION. Unlike PKG_PROG_PKG_CONFIG, which checks the user's
+dnl installed version of pkg-config, this checks the developer's version
+dnl of pkg.m4 when generating configure.
+dnl
+dnl To ensure that this macro is defined, also add:
+dnl m4_ifndef([PKG_PREREQ],
+dnl     [m4_fatal([must install pkg-config 0.29 or later before running autoconf/autogen])])
+dnl
+dnl See the "Since" comment for each macro you use to see what version
+dnl of the macros you require.
+m4_defun([PKG_PREREQ],
+[m4_define([PKG_MACROS_VERSION], [0.29.1])
+m4_if(m4_version_compare(PKG_MACROS_VERSION, [$1]), -1,
+    [m4_fatal([pkg.m4 version $1 or higher is required but ]PKG_MACROS_VERSION[ found])])
+])dnl PKG_PREREQ
+
+dnl PKG_PROG_PKG_CONFIG([MIN-VERSION])
+dnl ----------------------------------
+dnl Since: 0.16
+dnl
+dnl Search for the pkg-config tool and set the PKG_CONFIG variable to
+dnl first found in the path. Checks that the version of pkg-config found
+dnl is at least MIN-VERSION. If MIN-VERSION is not specified, 0.9.0 is
+dnl used since that's the first version where most current features of
+dnl pkg-config existed.
+AC_DEFUN([PKG_PROG_PKG_CONFIG],
+[m4_pattern_forbid([^_?PKG_[A-Z_]+$])
+m4_pattern_allow([^PKG_CONFIG(_(PATH|LIBDIR|SYSROOT_DIR|ALLOW_SYSTEM_(CFLAGS|LIBS)))?$])
+m4_pattern_allow([^PKG_CONFIG_(DISABLE_UNINSTALLED|TOP_BUILD_DIR|DEBUG_SPEW)$])
+AC_ARG_VAR([PKG_CONFIG], [path to pkg-config utility])
+AC_ARG_VAR([PKG_CONFIG_PATH], [directories to add to pkg-config's search path])
+AC_ARG_VAR([PKG_CONFIG_LIBDIR], [path overriding pkg-config's built-in search path])
+
+if test "x$ac_cv_env_PKG_CONFIG_set" != "xset"; then
+	AC_PATH_TOOL([PKG_CONFIG], [pkg-config])
+fi
+if test -n "$PKG_CONFIG"; then
+	_pkg_min_version=m4_default([$1], [0.9.0])
+	AC_MSG_CHECKING([pkg-config is at least version $_pkg_min_version])
+	if $PKG_CONFIG --atleast-pkgconfig-version $_pkg_min_version; then
+		AC_MSG_RESULT([yes])
+	else
+		AC_MSG_RESULT([no])
+		PKG_CONFIG=""
+	fi
+fi[]dnl
+])dnl PKG_PROG_PKG_CONFIG
+
+dnl PKG_CHECK_EXISTS(MODULES, [ACTION-IF-FOUND], [ACTION-IF-NOT-FOUND])
+dnl -------------------------------------------------------------------
+dnl Since: 0.18
+dnl
+dnl Check to see whether a particular set of modules exists. Similar to
+dnl PKG_CHECK_MODULES(), but does not set variables or print errors.
+dnl
+dnl Please remember that m4 expands AC_REQUIRE([PKG_PROG_PKG_CONFIG])
+dnl only at the first occurence in configure.ac, so if the first place
+dnl it's called might be skipped (such as if it is within an "if", you
+dnl have to call PKG_CHECK_EXISTS manually
+AC_DEFUN([PKG_CHECK_EXISTS],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])dnl
+if test -n "$PKG_CONFIG" && \
+    AC_RUN_LOG([$PKG_CONFIG --exists --print-errors "$1"]); then
+  m4_default([$2], [:])
+m4_ifvaln([$3], [else
+  $3])dnl
+fi])
+
+dnl _PKG_CONFIG([VARIABLE], [COMMAND], [MODULES])
+dnl ---------------------------------------------
+dnl Internal wrapper calling pkg-config via PKG_CONFIG and setting
+dnl pkg_failed based on the result.
+m4_define([_PKG_CONFIG],
+[if test -n "$$1"; then
+    pkg_cv_[]$1="$$1"
+ elif test -n "$PKG_CONFIG"; then
+    PKG_CHECK_EXISTS([$3],
+                     [pkg_cv_[]$1=`$PKG_CONFIG --[]$2 "$3" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes ],
+		     [pkg_failed=yes])
+ else
+    pkg_failed=untried
+fi[]dnl
+])dnl _PKG_CONFIG
+
+dnl _PKG_SHORT_ERRORS_SUPPORTED
+dnl ---------------------------
+dnl Internal check to see if pkg-config supports short errors.
+AC_DEFUN([_PKG_SHORT_ERRORS_SUPPORTED],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])
+if $PKG_CONFIG --atleast-pkgconfig-version 0.20; then
+        _pkg_short_errors_supported=yes
+else
+        _pkg_short_errors_supported=no
+fi[]dnl
+])dnl _PKG_SHORT_ERRORS_SUPPORTED
+
+
+dnl PKG_CHECK_MODULES(VARIABLE-PREFIX, MODULES, [ACTION-IF-FOUND],
+dnl   [ACTION-IF-NOT-FOUND])
+dnl --------------------------------------------------------------
+dnl Since: 0.4.0
+dnl
+dnl Note that if there is a possibility the first call to
+dnl PKG_CHECK_MODULES might not happen, you should be sure to include an
+dnl explicit call to PKG_PROG_PKG_CONFIG in your configure.ac
+AC_DEFUN([PKG_CHECK_MODULES],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])dnl
+AC_ARG_VAR([$1][_CFLAGS], [C compiler flags for $1, overriding pkg-config])dnl
+AC_ARG_VAR([$1][_LIBS], [linker flags for $1, overriding pkg-config])dnl
+
+pkg_failed=no
+AC_MSG_CHECKING([for $1])
+
+_PKG_CONFIG([$1][_CFLAGS], [cflags], [$2])
+_PKG_CONFIG([$1][_LIBS], [libs], [$2])
+
+m4_define([_PKG_TEXT], [Alternatively, you may set the environment variables $1[]_CFLAGS
+and $1[]_LIBS to avoid the need to call pkg-config.
+See the pkg-config man page for more details.])
+
+if test $pkg_failed = yes; then
+   	AC_MSG_RESULT([no])
+        _PKG_SHORT_ERRORS_SUPPORTED
+        if test $_pkg_short_errors_supported = yes; then
+	        $1[]_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "$2" 2>&1`
+        else 
+	        $1[]_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "$2" 2>&1`
+        fi
+	# Put the nasty error message in config.log where it belongs
+	echo "$$1[]_PKG_ERRORS" >&AS_MESSAGE_LOG_FD
+
+	m4_default([$4], [AC_MSG_ERROR(
+[Package requirements ($2) were not met:
+
+$$1_PKG_ERRORS
+
+Consider adjusting the PKG_CONFIG_PATH environment variable if you
+installed software in a non-standard prefix.
+
+_PKG_TEXT])[]dnl
+        ])
+elif test $pkg_failed = untried; then
+     	AC_MSG_RESULT([no])
+	m4_default([$4], [AC_MSG_FAILURE(
+[The pkg-config script could not be found or is too old.  Make sure it
+is in your PATH or set the PKG_CONFIG environment variable to the full
+path to pkg-config.
+
+_PKG_TEXT
+
+To get pkg-config, see <http://pkg-config.freedesktop.org/>.])[]dnl
+        ])
+else
+	$1[]_CFLAGS=$pkg_cv_[]$1[]_CFLAGS
+	$1[]_LIBS=$pkg_cv_[]$1[]_LIBS
+        AC_MSG_RESULT([yes])
+	$3
+fi[]dnl
+])dnl PKG_CHECK_MODULES
+
+
+dnl PKG_CHECK_MODULES_STATIC(VARIABLE-PREFIX, MODULES, [ACTION-IF-FOUND],
+dnl   [ACTION-IF-NOT-FOUND])
+dnl ---------------------------------------------------------------------
+dnl Since: 0.29
+dnl
+dnl Checks for existence of MODULES and gathers its build flags with
+dnl static libraries enabled. Sets VARIABLE-PREFIX_CFLAGS from --cflags
+dnl and VARIABLE-PREFIX_LIBS from --libs.
+dnl
+dnl Note that if there is a possibility the first call to
+dnl PKG_CHECK_MODULES_STATIC might not happen, you should be sure to
+dnl include an explicit call to PKG_PROG_PKG_CONFIG in your
+dnl configure.ac.
+AC_DEFUN([PKG_CHECK_MODULES_STATIC],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])dnl
+_save_PKG_CONFIG=$PKG_CONFIG
+PKG_CONFIG="$PKG_CONFIG --static"
+PKG_CHECK_MODULES($@)
+PKG_CONFIG=$_save_PKG_CONFIG[]dnl
+])dnl PKG_CHECK_MODULES_STATIC
+
+
+dnl PKG_INSTALLDIR([DIRECTORY])
+dnl -------------------------
+dnl Since: 0.27
+dnl
+dnl Substitutes the variable pkgconfigdir as the location where a module
+dnl should install pkg-config .pc files. By default the directory is
+dnl $libdir/pkgconfig, but the default can be changed by passing
+dnl DIRECTORY. The user can override through the --with-pkgconfigdir
+dnl parameter.
+AC_DEFUN([PKG_INSTALLDIR],
+[m4_pushdef([pkg_default], [m4_default([$1], ['${libdir}/pkgconfig'])])
+m4_pushdef([pkg_description],
+    [pkg-config installation directory @<:@]pkg_default[@:>@])
+AC_ARG_WITH([pkgconfigdir],
+    [AS_HELP_STRING([--with-pkgconfigdir], pkg_description)],,
+    [with_pkgconfigdir=]pkg_default)
+AC_SUBST([pkgconfigdir], [$with_pkgconfigdir])
+m4_popdef([pkg_default])
+m4_popdef([pkg_description])
+])dnl PKG_INSTALLDIR
+
+
+dnl PKG_NOARCH_INSTALLDIR([DIRECTORY])
+dnl --------------------------------
+dnl Since: 0.27
+dnl
+dnl Substitutes the variable noarch_pkgconfigdir as the location where a
+dnl module should install arch-independent pkg-config .pc files. By
+dnl default the directory is $datadir/pkgconfig, but the default can be
+dnl changed by passing DIRECTORY. The user can override through the
+dnl --with-noarch-pkgconfigdir parameter.
+AC_DEFUN([PKG_NOARCH_INSTALLDIR],
+[m4_pushdef([pkg_default], [m4_default([$1], ['${datadir}/pkgconfig'])])
+m4_pushdef([pkg_description],
+    [pkg-config arch-independent installation directory @<:@]pkg_default[@:>@])
+AC_ARG_WITH([noarch-pkgconfigdir],
+    [AS_HELP_STRING([--with-noarch-pkgconfigdir], pkg_description)],,
+    [with_noarch_pkgconfigdir=]pkg_default)
+AC_SUBST([noarch_pkgconfigdir], [$with_noarch_pkgconfigdir])
+m4_popdef([pkg_default])
+m4_popdef([pkg_description])
+])dnl PKG_NOARCH_INSTALLDIR
+
+
+dnl PKG_CHECK_VAR(VARIABLE, MODULE, CONFIG-VARIABLE,
+dnl [ACTION-IF-FOUND], [ACTION-IF-NOT-FOUND])
+dnl -------------------------------------------
+dnl Since: 0.28
+dnl
+dnl Retrieves the value of the pkg-config variable for the given module.
+AC_DEFUN([PKG_CHECK_VAR],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])dnl
+AC_ARG_VAR([$1], [value of $3 for $2, overriding pkg-config])dnl
+
+_PKG_CONFIG([$1], [variable="][$3]["], [$2])
+AS_VAR_COPY([$1], [pkg_cv_][$1])
+
+AS_VAR_IF([$1], [""], [$5], [$4])dnl
+])dnl PKG_CHECK_VAR
diff --git a/configure b/configure
index b5cdebb510..481d5d15e8 100755
--- a/configure
+++ b/configure
@@ -715,6 +715,12 @@ krb_srvtab
 with_python
 with_perl
 with_tcl
+ICU_LIBS
+ICU_CFLAGS
+PKG_CONFIG_LIBDIR
+PKG_CONFIG_PATH
+PKG_CONFIG
+with_icu
 enable_thread_safety
 INCLUDES
 autodepend
@@ -821,6 +827,7 @@ with_CC
 enable_depend
 enable_cassert
 enable_thread_safety
+with_icu
 with_tcl
 with_tclconfig
 with_perl
@@ -856,6 +863,11 @@ LDFLAGS
 LIBS
 CPPFLAGS
 CPP
+PKG_CONFIG
+PKG_CONFIG_PATH
+PKG_CONFIG_LIBDIR
+ICU_CFLAGS
+ICU_LIBS
 LDFLAGS_EX
 LDFLAGS_SL
 DOCBOOKSTYLE'
@@ -1511,6 +1523,7 @@ Optional Packages:
   --with-wal-segsize=SEGSIZE
                           set WAL segment size in MB [16]
   --with-CC=CMD           set compiler (deprecated)
+  --with-icu              build with ICU support
   --with-tcl              build Tcl modules (PL/Tcl)
   --with-tclconfig=DIR    tclConfig.sh is in DIR
   --with-perl             build Perl modules (PL/Perl)
@@ -1546,6 +1559,13 @@ Some influential environment variables:
   CPPFLAGS    (Objective) C/C++ preprocessor flags, e.g. -I<include dir> if
               you have headers in a nonstandard directory <include dir>
   CPP         C preprocessor
+  PKG_CONFIG  path to pkg-config utility
+  PKG_CONFIG_PATH
+              directories to add to pkg-config's search path
+  PKG_CONFIG_LIBDIR
+              path overriding pkg-config's built-in search path
+  ICU_CFLAGS  C compiler flags for ICU, overriding pkg-config
+  ICU_LIBS    linker flags for ICU, overriding pkg-config
   LDFLAGS_EX  extra linker flags for linking executables only
   LDFLAGS_SL  extra linker flags for linking shared libraries only
   DOCBOOKSTYLE
@@ -5362,6 +5382,255 @@ $as_echo "$enable_thread_safety" >&6; }
 
 
 #
+# ICU
+#
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with ICU support" >&5
+$as_echo_n "checking whether to build with ICU support... " >&6; }
+
+
+
+# Check whether --with-icu was given.
+if test "${with_icu+set}" = set; then :
+  withval=$with_icu;
+  case $withval in
+    yes)
+
+$as_echo "#define USE_ICU 1" >>confdefs.h
+
+      ;;
+    no)
+      :
+      ;;
+    *)
+      as_fn_error $? "no argument expected for --with-icu option" "$LINENO" 5
+      ;;
+  esac
+
+else
+  with_icu=no
+
+fi
+
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $with_icu" >&5
+$as_echo "$with_icu" >&6; }
+
+
+if test "$with_icu" = yes; then
+
+
+
+
+
+
+
+if test "x$ac_cv_env_PKG_CONFIG_set" != "xset"; then
+	if test -n "$ac_tool_prefix"; then
+  # Extract the first word of "${ac_tool_prefix}pkg-config", so it can be a program name with args.
+set dummy ${ac_tool_prefix}pkg-config; ac_word=$2
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
+$as_echo_n "checking for $ac_word... " >&6; }
+if ${ac_cv_path_PKG_CONFIG+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  case $PKG_CONFIG in
+  [\\/]* | ?:[\\/]*)
+  ac_cv_path_PKG_CONFIG="$PKG_CONFIG" # Let the user override the test with a path.
+  ;;
+  *)
+  as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
+for as_dir in $PATH
+do
+  IFS=$as_save_IFS
+  test -z "$as_dir" && as_dir=.
+    for ac_exec_ext in '' $ac_executable_extensions; do
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
+    ac_cv_path_PKG_CONFIG="$as_dir/$ac_word$ac_exec_ext"
+    $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
+    break 2
+  fi
+done
+  done
+IFS=$as_save_IFS
+
+  ;;
+esac
+fi
+PKG_CONFIG=$ac_cv_path_PKG_CONFIG
+if test -n "$PKG_CONFIG"; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $PKG_CONFIG" >&5
+$as_echo "$PKG_CONFIG" >&6; }
+else
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+fi
+
+
+fi
+if test -z "$ac_cv_path_PKG_CONFIG"; then
+  ac_pt_PKG_CONFIG=$PKG_CONFIG
+  # Extract the first word of "pkg-config", so it can be a program name with args.
+set dummy pkg-config; ac_word=$2
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
+$as_echo_n "checking for $ac_word... " >&6; }
+if ${ac_cv_path_ac_pt_PKG_CONFIG+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  case $ac_pt_PKG_CONFIG in
+  [\\/]* | ?:[\\/]*)
+  ac_cv_path_ac_pt_PKG_CONFIG="$ac_pt_PKG_CONFIG" # Let the user override the test with a path.
+  ;;
+  *)
+  as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
+for as_dir in $PATH
+do
+  IFS=$as_save_IFS
+  test -z "$as_dir" && as_dir=.
+    for ac_exec_ext in '' $ac_executable_extensions; do
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
+    ac_cv_path_ac_pt_PKG_CONFIG="$as_dir/$ac_word$ac_exec_ext"
+    $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
+    break 2
+  fi
+done
+  done
+IFS=$as_save_IFS
+
+  ;;
+esac
+fi
+ac_pt_PKG_CONFIG=$ac_cv_path_ac_pt_PKG_CONFIG
+if test -n "$ac_pt_PKG_CONFIG"; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_pt_PKG_CONFIG" >&5
+$as_echo "$ac_pt_PKG_CONFIG" >&6; }
+else
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+fi
+
+  if test "x$ac_pt_PKG_CONFIG" = x; then
+    PKG_CONFIG=""
+  else
+    case $cross_compiling:$ac_tool_warned in
+yes:)
+{ $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: using cross tools not prefixed with host triplet" >&5
+$as_echo "$as_me: WARNING: using cross tools not prefixed with host triplet" >&2;}
+ac_tool_warned=yes ;;
+esac
+    PKG_CONFIG=$ac_pt_PKG_CONFIG
+  fi
+else
+  PKG_CONFIG="$ac_cv_path_PKG_CONFIG"
+fi
+
+fi
+if test -n "$PKG_CONFIG"; then
+	_pkg_min_version=0.9.0
+	{ $as_echo "$as_me:${as_lineno-$LINENO}: checking pkg-config is at least version $_pkg_min_version" >&5
+$as_echo_n "checking pkg-config is at least version $_pkg_min_version... " >&6; }
+	if $PKG_CONFIG --atleast-pkgconfig-version $_pkg_min_version; then
+		{ $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5
+$as_echo "yes" >&6; }
+	else
+		{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+		PKG_CONFIG=""
+	fi
+fi
+
+pkg_failed=no
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for ICU" >&5
+$as_echo_n "checking for ICU... " >&6; }
+
+if test -n "$ICU_CFLAGS"; then
+    pkg_cv_ICU_CFLAGS="$ICU_CFLAGS"
+ elif test -n "$PKG_CONFIG"; then
+    if test -n "$PKG_CONFIG" && \
+    { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"icu-uc icu-i18n\""; } >&5
+  ($PKG_CONFIG --exists --print-errors "icu-uc icu-i18n") 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; then
+  pkg_cv_ICU_CFLAGS=`$PKG_CONFIG --cflags "icu-uc icu-i18n" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes
+else
+  pkg_failed=yes
+fi
+ else
+    pkg_failed=untried
+fi
+if test -n "$ICU_LIBS"; then
+    pkg_cv_ICU_LIBS="$ICU_LIBS"
+ elif test -n "$PKG_CONFIG"; then
+    if test -n "$PKG_CONFIG" && \
+    { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"icu-uc icu-i18n\""; } >&5
+  ($PKG_CONFIG --exists --print-errors "icu-uc icu-i18n") 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; then
+  pkg_cv_ICU_LIBS=`$PKG_CONFIG --libs "icu-uc icu-i18n" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes
+else
+  pkg_failed=yes
+fi
+ else
+    pkg_failed=untried
+fi
+
+
+
+if test $pkg_failed = yes; then
+   	{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+
+if $PKG_CONFIG --atleast-pkgconfig-version 0.20; then
+        _pkg_short_errors_supported=yes
+else
+        _pkg_short_errors_supported=no
+fi
+        if test $_pkg_short_errors_supported = yes; then
+	        ICU_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "icu-uc icu-i18n" 2>&1`
+        else
+	        ICU_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "icu-uc icu-i18n" 2>&1`
+        fi
+	# Put the nasty error message in config.log where it belongs
+	echo "$ICU_PKG_ERRORS" >&5
+
+	as_fn_error $? "Package requirements (icu-uc icu-i18n) were not met:
+
+$ICU_PKG_ERRORS
+
+Consider adjusting the PKG_CONFIG_PATH environment variable if you
+installed software in a non-standard prefix.
+
+Alternatively, you may set the environment variables ICU_CFLAGS
+and ICU_LIBS to avoid the need to call pkg-config.
+See the pkg-config man page for more details." "$LINENO" 5
+elif test $pkg_failed = untried; then
+     	{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+	{ { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5
+$as_echo "$as_me: error: in \`$ac_pwd':" >&2;}
+as_fn_error $? "The pkg-config script could not be found or is too old.  Make sure it
+is in your PATH or set the PKG_CONFIG environment variable to the full
+path to pkg-config.
+
+Alternatively, you may set the environment variables ICU_CFLAGS
+and ICU_LIBS to avoid the need to call pkg-config.
+See the pkg-config man page for more details.
+
+To get pkg-config, see <http://pkg-config.freedesktop.org/>.
+See \`config.log' for more details" "$LINENO" 5; }
+else
+	ICU_CFLAGS=$pkg_cv_ICU_CFLAGS
+	ICU_LIBS=$pkg_cv_ICU_LIBS
+        { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5
+$as_echo "yes" >&6; }
+
+fi
+fi
+
+#
 # Optionally build Tcl modules (PL/Tcl)
 #
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with Tcl" >&5
@@ -13433,6 +13702,50 @@ fi
 done
 
 
+if test "$with_icu" = yes; then
+  # ICU functions are macros, so we need to do this the long way.
+
+  # ucol_strcollUTF8() appeared in ICU 50.
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for ucol_strcollUTF8" >&5
+$as_echo_n "checking for ucol_strcollUTF8... " >&6; }
+if ${pgac_cv_func_ucol_strcollUTF8+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  ac_save_CPPFLAGS=$CPPFLAGS
+CPPFLAGS="$ICU_CFLAGS $CPPFLAGS"
+ac_save_LIBS=$LIBS
+LIBS="$ICU_LIBS $LIBS"
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+#include <unicode/ucol.h>
+
+int
+main ()
+{
+ucol_strcollUTF8(NULL, NULL, 0, NULL, 0, NULL);
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+  pgac_cv_func_ucol_strcollUTF8=yes
+else
+  pgac_cv_func_ucol_strcollUTF8=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+    conftest$ac_exeext conftest.$ac_ext
+CPPFLAGS=$ac_save_CPPFLAGS
+LIBS=$ac_save_LIBS
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_func_ucol_strcollUTF8" >&5
+$as_echo "$pgac_cv_func_ucol_strcollUTF8" >&6; }
+  if test "$pgac_cv_func_ucol_strcollUTF8" = yes ; then
+
+$as_echo "#define HAVE_UCOL_STRCOLLUTF8 1" >>confdefs.h
+
+  fi
+fi
+
 # Lastly, restore full LIBS list and check for readline/libedit symbols
 LIBS="$LIBS_including_readline"
 
diff --git a/configure.in b/configure.in
index 1d99cda1d8..235b8745d6 100644
--- a/configure.in
+++ b/configure.in
@@ -614,6 +614,19 @@ AC_MSG_RESULT([$enable_thread_safety])
 AC_SUBST(enable_thread_safety)
 
 #
+# ICU
+#
+AC_MSG_CHECKING([whether to build with ICU support])
+PGAC_ARG_BOOL(with, icu, no, [build with ICU support],
+              [AC_DEFINE([USE_ICU], 1, [Define to build with ICU support. (--with-icu)])])
+AC_MSG_RESULT([$with_icu])
+AC_SUBST(with_icu)
+
+if test "$with_icu" = yes; then
+  PKG_CHECK_MODULES(ICU, icu-uc icu-i18n)
+fi
+
+#
 # Optionally build Tcl modules (PL/Tcl)
 #
 AC_MSG_CHECKING([whether to build with Tcl])
@@ -1634,6 +1647,28 @@ fi
 AC_CHECK_FUNCS([strtoll strtoq], [break])
 AC_CHECK_FUNCS([strtoull strtouq], [break])
 
+if test "$with_icu" = yes; then
+  # ICU functions are macros, so we need to do this the long way.
+
+  # ucol_strcollUTF8() appeared in ICU 50.
+  AC_CACHE_CHECK([for ucol_strcollUTF8], [pgac_cv_func_ucol_strcollUTF8],
+[ac_save_CPPFLAGS=$CPPFLAGS
+CPPFLAGS="$ICU_CFLAGS $CPPFLAGS"
+ac_save_LIBS=$LIBS
+LIBS="$ICU_LIBS $LIBS"
+AC_LINK_IFELSE([AC_LANG_PROGRAM(
+[#include <unicode/ucol.h>
+],
+[ucol_strcollUTF8(NULL, NULL, 0, NULL, 0, NULL);])],
+[pgac_cv_func_ucol_strcollUTF8=yes],
+[pgac_cv_func_ucol_strcollUTF8=no])
+CPPFLAGS=$ac_save_CPPFLAGS
+LIBS=$ac_save_LIBS])
+  if test "$pgac_cv_func_ucol_strcollUTF8" = yes ; then
+    AC_DEFINE([HAVE_UCOL_STRCOLLUTF8], 1, [Define to 1 if you have the `ucol_strcollUTF8' function.])
+  fi
+fi
+
 # Lastly, restore full LIBS list and check for readline/libedit symbols
 LIBS="$LIBS_including_readline"
 
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 28cdabe6fe..33ae612892 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -2013,6 +2013,14 @@ <title><structname>pg_collation</> Columns</title>
      </row>
 
      <row>
+      <entry><structfield>collprovider</structfield></entry>
+      <entry><type>char</type></entry>
+      <entry></entry>
+      <entry>Provider of the collation: <literal>d</literal> = database
+       default, <literal>c</literal> = libc, <literal>i</literal> = icu</entry>
+     </row>
+
+     <row>
       <entry><structfield>collencoding</structfield></entry>
       <entry><type>int4</type></entry>
       <entry></entry>
@@ -2033,6 +2041,17 @@ <title><structname>pg_collation</> Columns</title>
       <entry></entry>
       <entry><symbol>LC_CTYPE</> for this collation object</entry>
      </row>
+
+     <row>
+      <entry><structfield>collversion</structfield></entry>
+      <entry><type>text</type></entry>
+      <entry></entry>
+      <entry>
+       Provider-specific version of the collation.  This is recorded when the
+       collation is created and then checked when it is used, to detect
+       changes in the collation definition that could lead to data corruption.
+      </entry>
+     </row>
     </tbody>
    </tgroup>
   </table>
diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml
index 2aba0fc528..01a38931fa 100644
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@@ -500,21 +500,45 @@ <title>Concepts</title>
    <title>Managing Collations</title>
 
    <para>
-    A collation is an SQL schema object that maps an SQL name to
-    operating system locales.  In particular, it maps to a combination
-    of <symbol>LC_COLLATE</symbol> and <symbol>LC_CTYPE</symbol>.  (As
+    A collation is an SQL schema object that maps an SQL name to locales
+    provided by libraries installed in the operating system.  A collation
+    definition has a <firstterm>provider</firstterm> that specifies which
+    library supplies the locale data.  One standard provider name
+    is <literal>libc</literal>, which uses the locales provided by the
+    operating system C library.  These are the locales that most tools
+    provided by the operating system use.  Another provider
+    is <literal>icu</literal>, which uses the external
+    ICU<indexterm><primary>ICU</></> library.  Support for ICU has to be
+    configured when PostgreSQL is built.
+   </para>
+
+   <para>
+    A collation object provided by <literal>libc</literal> maps to a combination
+    of <symbol>LC_COLLATE</symbol> and <symbol>LC_CTYPE</symbol> settings.  (As
     the name would suggest, the main purpose of a collation is to set
     <symbol>LC_COLLATE</symbol>, which controls the sort order.  But
     it is rarely necessary in practice to have an
     <symbol>LC_CTYPE</symbol> setting that is different from
     <symbol>LC_COLLATE</symbol>, so it is more convenient to collect
     these under one concept than to create another infrastructure for
-    setting <symbol>LC_CTYPE</symbol> per expression.)  Also, a collation
+    setting <symbol>LC_CTYPE</symbol> per expression.)  Also, a <literal>libc</literal> collation
     is tied to a character set encoding (see <xref linkend="multibyte">).
     The same collation name may exist for different encodings.
    </para>
 
    <para>
+    A collation provided by <literal>icu</literal> maps to a named collator
+    provided by the ICU library.  ICU does not support
+    separate <quote>collate</quote> and <quote>ctype</quote> settings, so they
+    are always the same.  Also, ICU collations are independent of the
+    encoding, so there is always only one ICU collation for a given name in a
+    database.
+   </para>
+
+   <sect3>
+    <title>Standard Collations</title>
+
+   <para>
     On all platforms, the collations named <literal>default</>,
     <literal>C</>, and <literal>POSIX</> are available.  Additional
     collations may be available depending on operating system support.
@@ -528,12 +552,36 @@ <title>Managing Collations</title>
    </para>
 
    <para>
+    Additionally, the SQL standard collation name <literal>ucs_basic</literal>
+    is available for encoding <literal>UTF8</literal>.  It is equivalent
+    to <literal>C</literal> and sorts by Unicode code point.
+   </para>
+  </sect3>
+
+  <sect3>
+   <title>Predefined Collations</title>
+
+   <para>
     If the operating system provides support for using multiple locales
     within a single program (<function>newlocale</> and related functions),
+    or support for ICU is configured,
     then when a database cluster is initialized, <command>initdb</command>
     populates the system catalog <literal>pg_collation</literal> with
     collations based on all the locales it finds on the operating
-    system at the time.  For example, the operating system might
+    system at the time.
+   </para>
+
+   <para>
+    To inspect the currently available locales, use the query <literal>SELECT
+    * FROM pg_collation</literal>, or the command <command>\dOS+</command>
+    in <application>psql</application>.
+   </para>
+
+  <sect4>
+   <title>libc collations</title>
+
+   <para>
+    For example, the operating system might
     provide a locale named <literal>de_DE.utf8</literal>.
     <command>initdb</command> would then create a collation named
     <literal>de_DE.utf8</literal> for encoding <literal>UTF8</literal>
@@ -544,17 +592,20 @@ <title>Managing Collations</title>
     under the name <literal>de_DE</literal>, which is less cumbersome
     to write and makes the name less encoding-dependent.  Note that,
     nevertheless, the initial set of collation names is
-    platform-dependent.
+    platform-dependent.  Collations provided by ICU are created
+    with <literal>%icu</literal> appended, for
+    example <literal>de%icu</literal>.
    </para>
 
    <para>
-    In case a collation is needed that has different values for
-    <symbol>LC_COLLATE</symbol> and <symbol>LC_CTYPE</symbol>, a new
-    collation may be created using
-    the <xref linkend="sql-createcollation"> command.  That command
-    can also be used to create a new collation from an existing
-    collation, which can be useful to be able to use
-    operating-system-independent collation names in applications.
+    The default set of collations provided by <literal>libc</literal> maps
+    directly to the locales installed in the operating system, which can be
+    listed using the command <literal>locale -a</literal>.
+    In case a <literal>libc</literal> collation is needed that has different values for
+    <symbol>LC_COLLATE</symbol> and <symbol>LC_CTYPE</symbol>, or new locales
+    are installed in the operating system after the database system was
+    initialized, then a new collation may be created using
+    the <xref linkend="sql-createcollation"> command.
    </para>
 
    <para>
@@ -566,8 +617,8 @@ <title>Managing Collations</title>
     Use of the stripped collation names is recommended, since it will
     make one less thing you need to change if you decide to change to
     another database encoding.  Note however that the <literal>default</>,
-    <literal>C</>, and <literal>POSIX</> collations can be used
-    regardless of the database encoding.
+    <literal>C</>, and <literal>POSIX</> collations, as well as all collations
+    provided by ICU can be used regardless of the database encoding.
    </para>
 
    <para>
@@ -581,6 +632,97 @@ <title>Managing Collations</title>
     collations have identical behaviors.  Mixing stripped and non-stripped
     collation names is therefore not recommended.
    </para>
+  </sect4>
+
+  <sect4>
+   <title>ICU collations</title>
+
+   <para>
+    With ICU, it is not sensible to enumerate all possible locale names.  ICU
+    uses a particular naming system for locales, but there are many more ways
+    to name a locale than there are actually distinct locales.  (In fact, any
+    string will be accepted as a locale name.)
+    See <ulink url="http://userguide.icu-project.org/locale"></ulink> for
+    information on ICU locale naming.  <command>initdb</command> uses the ICU
+    APIs to extract a set of locales with distinct collation rules to populate
+    the initial set of collations.  Here are some examples collations that
+    might be created:
+
+    <variablelist>
+     <varlistentry>
+      <term><literal>de%icu</literal></term>
+      <listitem>
+       <para>German collation, default variant</para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><literal>de_phonebook%icu</literal></term>
+      <listitem>
+       <para>German collation, phone book variant</para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><literal>de_AT%icu</literal></term>
+      <listitem>
+       <para>German collation for Austria, default variant</para>
+       <para>
+        (Note that as of this writing, there is no,
+        say, <literal>de_DE%icu</literal> or <literal>de_CH%icu</literal>,
+        since those are equivalent to <literal>de%icu</literal>.)
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><literal>de_AT_phonebook%icu</literal></term>
+      <listitem>
+       <para>German collation for Austria, phone book variant</para>
+      </listitem>
+     </varlistentry>
+     <varlistentry>
+      <term><literal>root%icu</literal></term>
+      <listitem>
+       <para>
+        ICU <quote>root</quote> collation.  Use this to get a reasonable
+        language-agnostic sort order.
+       </para>
+      </listitem>
+     </varlistentry>
+    </variablelist>
+   </para>
+
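+   <para>
+    A collation for an ICU locale or variant that is not in the predefined
+    set can also be defined manually with
+    the <xref linkend="sql-createcollation"> command, for example (the
+    collation name is arbitrary):
+<programlisting>
+CREATE COLLATION german_phonebook (provider = icu, locale = 'de@collation=phonebook');
+</programlisting>
+   </para>
+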
+   <para>
+    Some (less frequently used) encodings are not supported by ICU.  If the
+    database cluster was initialized with such an encoding, no ICU collations
+    will be predefined.
+   </para>
+   </sect4>
+   </sect3>
+
+   <sect3>
+   <title>Copying Collations</title>
+
+   <para>
+    The command <xref linkend="sql-createcollation"> can also be used to
+    create a new collation from an existing collation, which can be useful for
+    using operating-system-independent collation names in applications,
+    creating compatibility names, or using an ICU-provided collation under a
+    suffix-less name.  For example:
+<programlisting>
+CREATE COLLATION german FROM "de_DE";
+CREATE COLLATION french FROM "fr%icu";
+CREATE COLLATION "de_DE%icu" FROM "de%icu";
+</programlisting>
+   </para>
+
+   <para>
+    The standard and predefined collations are in the
+    schema <literal>pg_catalog</literal>, like all predefined objects.
+    User-defined collations should be created in user schemas.  This also
+    ensures that they are saved by <command>pg_dump</command>.
+   </para>
   </sect2>
  </sect1>
 
diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
index 568995c9f2..b35ffa1e5c 100644
--- a/doc/src/sgml/installation.sgml
+++ b/doc/src/sgml/installation.sgml
@@ -767,6 +767,20 @@ <title>Configuration</title>
       </varlistentry>
 
       <varlistentry>
+       <term><option>--with-icu</option></term>
+       <listitem>
+        <para>
+         Build with support for
+         the <productname>ICU</productname><indexterm><primary>ICU</></>
+         library.  This requires the <productname>ICU4C</productname> package
+         as well
+         as <productname>pkg-config</productname><indexterm><primary>pkg-config</></>
+         to be installed.
+        </para>
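+        <para>
+         For example, if <productname>ICU</productname> was installed under a
+         non-standard prefix, pointing <productname>pkg-config</productname>
+         at it might look like this (the path is only an example):
+<programlisting>
+./configure --with-icu PKG_CONFIG_PATH=/opt/icu/lib/pkgconfig
+</programlisting>
+        </para>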
+       </listitem>
+      </varlistentry>
+
+      <varlistentry>
        <term><option>--with-openssl</option>
        <indexterm>
         <primary>OpenSSL</primary>
diff --git a/doc/src/sgml/ref/alter_collation.sgml b/doc/src/sgml/ref/alter_collation.sgml
index 6708c7e10e..3bca3f1396 100644
--- a/doc/src/sgml/ref/alter_collation.sgml
+++ b/doc/src/sgml/ref/alter_collation.sgml
@@ -21,6 +21,8 @@
 
  <refsynopsisdiv>
 <synopsis>
+ALTER COLLATION <replaceable>name</replaceable> REFRESH VERSION
+
 ALTER COLLATION <replaceable>name</replaceable> RENAME TO <replaceable>new_name</replaceable>
 ALTER COLLATION <replaceable>name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_USER | SESSION_USER }
 ALTER COLLATION <replaceable>name</replaceable> SET SCHEMA <replaceable>new_schema</replaceable>
@@ -85,9 +87,49 @@ <title>Parameters</title>
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term><literal>REFRESH VERSION</literal></term>
+    <listitem>
+     <para>
+      Update the collation version.
+      See <xref linkend="sql-altercollation-notes"> below.
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
  </refsect1>
 
+ <refsect1 id="sql-altercollation-notes">
+  <title>Notes</title>
+
+  <para>
+   When using collations provided by the ICU library, the ICU-specific version
+   of the collator is recorded in the system catalog when the collation object
+   is created.  When the collation is then used, the current version is
+   checked against the recorded version, and a warning is issued when there is
+   a mismatch, for example:
+<screen>
+WARNING:  ICU collator version mismatch
+DETAIL:  The database was created using version 1.2.3.4, the library provides version 2.3.4.5.
+HINT:  Rebuild all objects affected by this collation and run ALTER COLLATION pg_catalog."xx%icu" REFRESH VERSION, or build PostgreSQL with the right version of ICU.
+</screen>
+   A change in collation definitions can lead to corrupt indexes and other
+   problems where the database system relies on stored objects having a
+   certain sort order.  Generally, this should be avoided, but it can happen
+   in legitimate circumstances, such as when
+   using <command>pg_upgrade</command> to upgrade to server binaries linked
+   with a newer version of ICU.  When this happens, all objects depending on
+   the collation should be rebuilt, for example,
+   using <command>REINDEX</command>.  When that is done, the collation version
+   can be refreshed using the command <literal>ALTER COLLATION ... REFRESH
+   VERSION</literal>.  This will update the system catalog to record the
+   current collator version and will make the warning go away.  Note that this
+   does not actually check whether all affected objects have been rebuilt
+   correctly.
+  </para>
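+
+  <para>
+   For example, if an index <literal>test1c_content_idx</literal> depends on
+   the collation <literal>"de%icu"</literal> (both names are only
+   illustrative), the sequence could be:
+<programlisting>
+REINDEX INDEX test1c_content_idx;
+ALTER COLLATION "de%icu" REFRESH VERSION;
+</programlisting>
+  </para>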
+ </refsect1>
+
  <refsect1>
   <title>Examples</title>
 
diff --git a/doc/src/sgml/ref/create_collation.sgml b/doc/src/sgml/ref/create_collation.sgml
index c09e5bd6d4..ee7a8ea5ae 100644
--- a/doc/src/sgml/ref/create_collation.sgml
+++ b/doc/src/sgml/ref/create_collation.sgml
@@ -21,7 +21,8 @@
 CREATE COLLATION [ IF NOT EXISTS ] <replaceable>name</replaceable> (
     [ LOCALE = <replaceable>locale</replaceable>, ]
     [ LC_COLLATE = <replaceable>lc_collate</replaceable>, ]
-    [ LC_CTYPE = <replaceable>lc_ctype</replaceable> ]
+    [ LC_CTYPE = <replaceable>lc_ctype</replaceable>, ]
+    [ PROVIDER = <replaceable>provider</replaceable> ]
 )
 CREATE COLLATION [ IF NOT EXISTS ] <replaceable>name</replaceable> FROM <replaceable>existing_collation</replaceable>
 </synopsis>
@@ -114,6 +115,19 @@ <title>Parameters</title>
     </varlistentry>
 
     <varlistentry>
+     <term><replaceable>provider</replaceable></term>
+
+     <listitem>
+      <para>
+       Specifies the provider to use for locale services associated with this
+       collation.  Possible values
+       are: <literal>icu</literal>,<indexterm><primary>ICU</></> <literal>libc</literal>.
+       The available choices depend on the operating system and build options.
+      </para>
+     </listitem>
+    </varlistentry>
+
+    <varlistentry>
      <term><replaceable>existing_collation</replaceable></term>
 
      <listitem>
diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index 831d39a9d1..8a9815dfef 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -179,6 +179,7 @@ pgxsdir = $(pkglibdir)/pgxs
 #
 # Records the choice of the various --enable-xxx and --with-xxx options.
 
+with_icu	= @with_icu@
 with_perl	= @with_perl@
 with_python	= @with_python@
 with_tcl	= @with_tcl@
@@ -208,6 +209,9 @@ python_version		= @python_version@
 
 krb_srvtab = @krb_srvtab@
 
+ICU_CFLAGS		= @ICU_CFLAGS@
+ICU_LIBS		= @ICU_LIBS@
+
 TCLSH			= @TCLSH@
 TCL_LIBS		= @TCL_LIBS@
 TCL_LIB_SPEC		= @TCL_LIB_SPEC@
diff --git a/src/backend/Makefile b/src/backend/Makefile
index 7a0bbb2942..fffb0d95ba 100644
--- a/src/backend/Makefile
+++ b/src/backend/Makefile
@@ -58,7 +58,7 @@ ifneq ($(PORTNAME), win32)
 ifneq ($(PORTNAME), aix)
 
 postgres: $(OBJS)
-	$(CC) $(CFLAGS) $(LDFLAGS) $(LDFLAGS_EX) $(export_dynamic) $(call expand_subsys,$^) $(LIBS) -o $@
+	$(CC) $(CFLAGS) $(LDFLAGS) $(LDFLAGS_EX) $(export_dynamic) $(call expand_subsys,$^) $(LIBS) $(ICU_LIBS) -o $@
 
 endif
 endif
diff --git a/src/backend/catalog/pg_collation.c b/src/backend/catalog/pg_collation.c
index 65b6051c0d..ffe68aa114 100644
--- a/src/backend/catalog/pg_collation.c
+++ b/src/backend/catalog/pg_collation.c
@@ -27,6 +27,7 @@
 #include "mb/pg_wchar.h"
 #include "utils/builtins.h"
 #include "utils/fmgroids.h"
+#include "utils/pg_locale.h"
 #include "utils/rel.h"
 #include "utils/syscache.h"
 #include "utils/tqual.h"
@@ -40,10 +41,12 @@
 Oid
 CollationCreate(const char *collname, Oid collnamespace,
 				Oid collowner,
+				char collprovider,
 				int32 collencoding,
 				const char *collcollate, const char *collctype,
 				bool if_not_exists)
 {
+	char	   *collversion;
 	Relation	rel;
 	TupleDesc	tupDesc;
 	HeapTuple	tup;
@@ -78,21 +81,27 @@ CollationCreate(const char *collname, Oid collnamespace,
 		{
 			ereport(NOTICE,
 				(errcode(ERRCODE_DUPLICATE_OBJECT),
-				 errmsg("collation \"%s\" for encoding \"%s\" already exists, skipping",
-						collname, pg_encoding_to_char(collencoding))));
+				 collencoding == -1
+				 ? errmsg("collation \"%s\" already exists, skipping",
+						  collname)
+				 : errmsg("collation \"%s\" for encoding \"%s\" already exists, skipping",
+						  collname, pg_encoding_to_char(collencoding))));
 			return InvalidOid;
 		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_DUPLICATE_OBJECT),
-					 errmsg("collation \"%s\" for encoding \"%s\" already exists",
-							collname, pg_encoding_to_char(collencoding))));
+					 collencoding == -1
+					 ? errmsg("collation \"%s\" already exists",
+							  collname)
+					 : errmsg("collation \"%s\" for encoding \"%s\" already exists",
+							  collname, pg_encoding_to_char(collencoding))));
 	}
 
 	/*
 	 * Also forbid matching an any-encoding entry.  This test of course is not
 	 * backed up by the unique index, but it's not a problem since we don't
-	 * support adding any-encoding entries after initdb.
+	 * support adding any-encoding entries after initdb. FIXME
 	 */
 	if (SearchSysCacheExists3(COLLNAMEENCNSP,
 							  PointerGetDatum(collname),
@@ -114,6 +123,8 @@ CollationCreate(const char *collname, Oid collnamespace,
 						collname)));
 	}
 
+	collversion = get_system_collation_version(collprovider, collcollate);
+
 	/* open pg_collation */
 	rel = heap_open(CollationRelationId, RowExclusiveLock);
 	tupDesc = RelationGetDescr(rel);
@@ -125,11 +136,16 @@ CollationCreate(const char *collname, Oid collnamespace,
 	values[Anum_pg_collation_collname - 1] = NameGetDatum(&name_name);
 	values[Anum_pg_collation_collnamespace - 1] = ObjectIdGetDatum(collnamespace);
 	values[Anum_pg_collation_collowner - 1] = ObjectIdGetDatum(collowner);
+	values[Anum_pg_collation_collprovider - 1] = CharGetDatum(collprovider);
 	values[Anum_pg_collation_collencoding - 1] = Int32GetDatum(collencoding);
 	namestrcpy(&name_collate, collcollate);
 	values[Anum_pg_collation_collcollate - 1] = NameGetDatum(&name_collate);
 	namestrcpy(&name_ctype, collctype);
 	values[Anum_pg_collation_collctype - 1] = NameGetDatum(&name_ctype);
+	if (collversion)
+		values[Anum_pg_collation_collversion - 1] = CStringGetTextDatum(collversion);
+	else
+		nulls[Anum_pg_collation_collversion - 1] = true;
 
 	tup = heap_form_tuple(tupDesc, values, nulls);
 
diff --git a/src/backend/commands/collationcmds.c b/src/backend/commands/collationcmds.c
index 919cfc6a06..5f9f142de8 100644
--- a/src/backend/commands/collationcmds.c
+++ b/src/backend/commands/collationcmds.c
@@ -14,15 +14,18 @@
  */
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/htup_details.h"
 #include "access/xact.h"
 #include "catalog/dependency.h"
 #include "catalog/indexing.h"
 #include "catalog/namespace.h"
+#include "catalog/objectaccess.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_collation_fn.h"
 #include "commands/alter.h"
 #include "commands/collationcmds.h"
+#include "commands/comment.h"
 #include "commands/dbcommands.h"
 #include "commands/defrem.h"
 #include "mb/pg_wchar.h"
@@ -33,6 +36,7 @@
 #include "utils/rel.h"
 #include "utils/syscache.h"
 
+
 /*
  * CREATE COLLATION
  */
@@ -47,8 +51,12 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
 	DefElem    *localeEl = NULL;
 	DefElem    *lccollateEl = NULL;
 	DefElem    *lcctypeEl = NULL;
+	DefElem    *providerEl = NULL;
 	char	   *collcollate = NULL;
 	char	   *collctype = NULL;
+	char	   *collproviderstr = NULL;
+	int			collencoding;
+	char		collprovider = 0;
 	Oid			newoid;
 	ObjectAddress address;
 
@@ -72,6 +80,8 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
 			defelp = &lccollateEl;
 		else if (pg_strcasecmp(defel->defname, "lc_ctype") == 0)
 			defelp = &lcctypeEl;
+		else if (pg_strcasecmp(defel->defname, "provider") == 0)
+			defelp = &providerEl;
 		else
 		{
 			ereport(ERROR,
@@ -103,6 +113,7 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
 
 		collcollate = pstrdup(NameStr(((Form_pg_collation) GETSTRUCT(tp))->collcollate));
 		collctype = pstrdup(NameStr(((Form_pg_collation) GETSTRUCT(tp))->collctype));
+		collprovider = ((Form_pg_collation) GETSTRUCT(tp))->collprovider;
 
 		ReleaseSysCache(tp);
 	}
@@ -119,6 +130,24 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
 	if (lcctypeEl)
 		collctype = defGetString(lcctypeEl);
 
+	if (providerEl)
+		collproviderstr = defGetString(providerEl);
+
+	if (collproviderstr)
+	{
+		if (pg_strcasecmp(collproviderstr, "icu") == 0)
+			collprovider = COLLPROVIDER_ICU;
+		else if (pg_strcasecmp(collproviderstr, "libc") == 0)
+			collprovider = COLLPROVIDER_LIBC;
+		else
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+					 errmsg("unrecognized collation provider: %s",
+							collproviderstr)));
+	}
+	else if (!fromEl)
+		collprovider = COLLPROVIDER_LIBC;
+
 	if (!collcollate)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
@@ -129,12 +158,19 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
 				(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
 				 errmsg("parameter \"lc_ctype\" must be specified")));
 
-	check_encoding_locale_matches(GetDatabaseEncoding(), collcollate, collctype);
+	if (collprovider == COLLPROVIDER_ICU)
+		collencoding = -1;
+	else
+	{
+		collencoding = GetDatabaseEncoding();
+		check_encoding_locale_matches(collencoding, collcollate, collctype);
+	}
 
 	newoid = CollationCreate(collName,
 							 collNamespace,
 							 GetUserId(),
-							 GetDatabaseEncoding(),
+							 collprovider,
+							 collencoding,
 							 collcollate,
 							 collctype,
 							 if_not_exists);
@@ -182,6 +218,79 @@ IsThereCollationInNamespace(const char *collname, Oid nspOid)
 						collname, get_namespace_name(nspOid))));
 }
 
+/*
+ * ALTER COLLATION
+ */
+ObjectAddress
+AlterCollation(AlterCollationStmt *stmt)
+{
+	Relation	rel;
+	Oid			collOid;
+	HeapTuple	tup;
+	Form_pg_collation collForm;
+	Datum		collversion;
+	bool		isnull;
+	char	   *oldversion;
+	char	   *newversion;
+	ObjectAddress address;
+
+	rel = heap_open(CollationRelationId, RowExclusiveLock);
+	collOid = get_collation_oid(stmt->collname, false);
+
+	if (!pg_collation_ownercheck(collOid, GetUserId()))
+		aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_COLLATION,
+					   NameListToString(stmt->collname));
+
+	tup = SearchSysCacheCopy1(COLLOID, ObjectIdGetDatum(collOid));
+	if (!HeapTupleIsValid(tup))
+		elog(ERROR, "cache lookup failed for collation %u", collOid);
+
+	collForm = (Form_pg_collation) GETSTRUCT(tup);
+	collversion = SysCacheGetAttr(COLLOID, tup, Anum_pg_collation_collversion,
+								  &isnull);
+	oldversion = isnull ? NULL : TextDatumGetCString(collversion);
+
+	newversion = get_system_collation_version(collForm->collprovider, NameStr(collForm->collcollate));
+
+	/* cannot change from NULL to non-NULL or vice versa */
+	if ((!oldversion && newversion) || (oldversion && !newversion))
+		elog(ERROR, "invalid collation version change");
+	else if (oldversion && newversion && strcmp(newversion, oldversion) != 0)
+	{
+		bool        nulls[Natts_pg_collation];
+		bool        replaces[Natts_pg_collation];
+		Datum       values[Natts_pg_collation];
+
+		ereport(NOTICE,
+				(errmsg("changing version from %s to %s",
+						oldversion, newversion)));
+
+		memset(values, 0, sizeof(values));
+		memset(nulls, false, sizeof(nulls));
+		memset(replaces, false, sizeof(replaces));
+
+		values[Anum_pg_collation_collversion - 1] = CStringGetTextDatum(newversion);
+		replaces[Anum_pg_collation_collversion - 1] = true;
+
+		tup = heap_modify_tuple(tup, RelationGetDescr(rel),
+								values, nulls, replaces);
+	}
+	else
+		ereport(NOTICE,
+				(errmsg("version has not changed")));
+
+	CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+	InvokeObjectPostAlterHook(CollationRelationId, collOid, 0);
+
+	ObjectAddressSet(address, CollationRelationId, collOid);
+
+	heap_freetuple(tup);
+	heap_close(rel, NoLock);
+
+	return address;
+}
+
 
 /*
  * "Normalize" a locale name, stripping off encoding tags such as
@@ -219,6 +328,29 @@ normalize_locale_name(char *new, const char *old)
 }
 
 
+#ifdef USE_ICU
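+/*
+ * Get the ICU display name (in English) of the given locale, for use as the
+ * comment on a predefined ICU collation.
+ */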
+static char *
+get_icu_locale_comment(const char *localename)
+{
+	UErrorCode	status;
+	UChar		displayname[128];
+	int32		len_uchar;
+	char	   *result;
+
+	status = U_ZERO_ERROR;
+	len_uchar = uloc_getDisplayName(localename, "en", &displayname[0], lengthof(displayname), &status);
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("could not get display name for locale \"%s\": %s",
+						localename, u_errorName(status))));
+
+	icu_from_uchar(&result, displayname, len_uchar);
+
+	return result;
+}
+#endif	/* USE_ICU */
+
+
 Datum
 pg_import_system_collations(PG_FUNCTION_ARGS)
 {
@@ -302,7 +434,7 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
 
 		count++;
 
-		CollationCreate(localebuf, nspid, GetUserId(), enc,
+		CollationCreate(localebuf, nspid, GetUserId(), COLLPROVIDER_LIBC, enc,
 						localebuf, localebuf, if_not_exists);
 
 		CommandCounterIncrement();
@@ -333,7 +465,7 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
 		char	   *locale = (char *) lfirst(lcl);
 		int			enc = lfirst_int(lce);
 
-		CollationCreate(alias, nspid, GetUserId(), enc,
+		CollationCreate(alias, nspid, GetUserId(), COLLPROVIDER_LIBC, enc,
 						locale, locale, true);
 		CommandCounterIncrement();
 	}
@@ -343,5 +475,70 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
 				(errmsg("no usable system locales were found")));
 #endif   /* not HAVE_LOCALE_T && not WIN32 */
 
+#ifdef USE_ICU
+	if (!is_encoding_supported_by_icu(GetDatabaseEncoding()))
+	{
+		ereport(NOTICE,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("encoding \"%s\" not supported by ICU",
+						pg_encoding_to_char(GetDatabaseEncoding()))));
+	}
+	else
+	{
+		int i;
+
+		/*
+		 * Start the loop at -1 to sneak in the root locale without too much
+		 * code duplication.
+		 */
+		for (i = -1; i < ucol_countAvailable(); i++)
+		{
+			const char *name;
+			UEnumeration *en;
+			UErrorCode	status;
+			const char *val;
+			Oid			collid;
+
+			if (i == -1)
+				name = "";  /* ICU root locale */
+			else
+				name = ucol_getAvailable(i);
+			collid = CollationCreate(psprintf("%s%%icu", i == -1 ? "root" : name),
+									 nspid, GetUserId(), COLLPROVIDER_ICU, -1,
+									 name, name, if_not_exists);
+
+			CreateComments(collid, CollationRelationId, 0,
+						   get_icu_locale_comment(name));
+
+			/*
+			 * Add keyword variants
+			 */
+			status = U_ZERO_ERROR;
+			en = ucol_getKeywordValuesForLocale("collation", name, TRUE, &status);
+			if (U_FAILURE(status))
+				ereport(ERROR,
+						(errmsg("could not get keyword values for locale \"%s\": %s",
+								name, u_errorName(status))));
+
+			status = U_ZERO_ERROR;
+			uenum_reset(en, &status);
+			while ((val = uenum_next(en, NULL, &status)))
+			{
+				char *localename = psprintf("%s@collation=%s", name, val);
+				collid = CollationCreate(psprintf("%s_%s%%icu", i == -1 ? "root" : name, val),
+										 nspid, GetUserId(), COLLPROVIDER_ICU, -1,
+										 localename, localename, if_not_exists);
+				CreateComments(collid, CollationRelationId, 0,
+							   get_icu_locale_comment(localename));
+			}
+			if (U_FAILURE(status))
+				ereport(ERROR,
+						(errmsg("could not get keyword values for locale \"%s\": %s",
+								name, u_errorName(status))));
+			uenum_close(en);
+		}
+	}
+#endif
+
 	PG_RETURN_VOID();
 }
diff --git a/src/backend/common.mk b/src/backend/common.mk
index 5d599dbd0c..0b57543bc4 100644
--- a/src/backend/common.mk
+++ b/src/backend/common.mk
@@ -8,6 +8,8 @@
 # this directory and SUBDIRS to subdirectories containing more things
 # to build.
 
+override CPPFLAGS := $(CPPFLAGS) $(ICU_CFLAGS)
+
 ifdef PARTIAL_LINKING
 # old style: linking using SUBSYS.o
 subsysfilename = SUBSYS.o
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index bfc2ac1716..71daabbf97 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -3028,6 +3028,16 @@ _copyAlterTableCmd(const AlterTableCmd *from)
 	return newnode;
 }
 
+static AlterCollationStmt *
+_copyAlterCollationStmt(const AlterCollationStmt *from)
+{
+	AlterCollationStmt *newnode = makeNode(AlterCollationStmt);
+
+	COPY_NODE_FIELD(collname);
+
+	return newnode;
+}
+
 static AlterDomainStmt *
 _copyAlterDomainStmt(const AlterDomainStmt *from)
 {
@@ -4959,6 +4969,9 @@ copyObject(const void *from)
 		case T_AlterTableCmd:
 			retval = _copyAlterTableCmd(from);
 			break;
+		case T_AlterCollationStmt:
+			retval = _copyAlterCollationStmt(from);
+			break;
 		case T_AlterDomainStmt:
 			retval = _copyAlterDomainStmt(from);
 			break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 54e9c983a0..57fe7ca61e 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1087,6 +1087,14 @@ _equalAlterTableCmd(const AlterTableCmd *a, const AlterTableCmd *b)
 }
 
 static bool
+_equalAlterCollationStmt(const AlterCollationStmt *a, const AlterCollationStmt *b)
+{
+	COMPARE_NODE_FIELD(collname);
+
+	return true;
+}
+
+static bool
 _equalAlterDomainStmt(const AlterDomainStmt *a, const AlterDomainStmt *b)
 {
 	COMPARE_SCALAR_FIELD(subtype);
@@ -3156,6 +3164,9 @@ equal(const void *a, const void *b)
 		case T_AlterTableCmd:
 			retval = _equalAlterTableCmd(a, b);
 			break;
+		case T_AlterCollationStmt:
+			retval = _equalAlterCollationStmt(a, b);
+			break;
 		case T_AlterDomainStmt:
 			retval = _equalAlterDomainStmt(a, b);
 			break;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index e7acc2d9a2..e337abd78d 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -244,7 +244,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 }
 
 %type <node>	stmt schema_stmt
-		AlterEventTrigStmt
+		AlterEventTrigStmt AlterCollationStmt
 		AlterDatabaseStmt AlterDatabaseSetStmt AlterDomainStmt AlterEnumStmt
 		AlterFdwStmt AlterForeignServerStmt AlterGroupStmt
 		AlterObjectDependsStmt AlterObjectSchemaStmt AlterOwnerStmt
@@ -812,6 +812,7 @@ stmtmulti:	stmtmulti ';' stmt
 
 stmt :
 			AlterEventTrigStmt
+			| AlterCollationStmt
 			| AlterDatabaseStmt
 			| AlterDatabaseSetStmt
 			| AlterDefaultPrivilegesStmt
@@ -9633,6 +9634,21 @@ DropdbStmt: DROP DATABASE database_name
 
 /*****************************************************************************
  *
+ *		ALTER COLLATION
+ *
+ *****************************************************************************/
+
+AlterCollationStmt: ALTER COLLATION any_name REFRESH VERSION_P
+				{
+					AlterCollationStmt *n = makeNode(AlterCollationStmt);
+					n->collname = $3;
+					$$ = (Node *)n;
+				}
+		;
+
+
+/*****************************************************************************
+ *
  *		ALTER SYSTEM
  *
  * This is used to change configuration parameters persistently.
diff --git a/src/backend/regex/regc_pg_locale.c b/src/backend/regex/regc_pg_locale.c
index afa3a7d613..55c8ab12d6 100644
--- a/src/backend/regex/regc_pg_locale.c
+++ b/src/backend/regex/regc_pg_locale.c
@@ -68,7 +68,8 @@ typedef enum
 	PG_REGEX_LOCALE_WIDE,		/* Use <wctype.h> functions */
 	PG_REGEX_LOCALE_1BYTE,		/* Use <ctype.h> functions */
 	PG_REGEX_LOCALE_WIDE_L,		/* Use locale_t <wctype.h> functions */
-	PG_REGEX_LOCALE_1BYTE_L		/* Use locale_t <ctype.h> functions */
+	PG_REGEX_LOCALE_1BYTE_L,	/* Use locale_t <ctype.h> functions */
+	PG_REGEX_LOCALE_ICU			/* Use ICU uchar.h functions */
 } PG_Locale_Strategy;
 
 static PG_Locale_Strategy pg_regex_strategy;
@@ -262,6 +263,11 @@ pg_set_regex_collation(Oid collation)
 					 errhint("Use the COLLATE clause to set the collation explicitly.")));
 		}
 
+#ifdef USE_ICU
+		if (pg_regex_locale && pg_regex_locale->provider == COLLPROVIDER_ICU)
+			pg_regex_strategy = PG_REGEX_LOCALE_ICU;
+		else
+#endif
 #ifdef USE_WIDE_UPPER_LOWER
 		if (GetDatabaseEncoding() == PG_UTF8)
 		{
@@ -303,13 +309,18 @@ pg_wc_isdigit(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswdigit_l((wint_t) c, pg_regex_locale);
+				return iswdigit_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isdigit_l((unsigned char) c, pg_regex_locale));
+					isdigit_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isdigit(c);
 #endif
 			break;
 	}
@@ -336,13 +347,18 @@ pg_wc_isalpha(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswalpha_l((wint_t) c, pg_regex_locale);
+				return iswalpha_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isalpha_l((unsigned char) c, pg_regex_locale));
+					isalpha_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isalpha(c);
 #endif
 			break;
 	}
@@ -369,13 +385,18 @@ pg_wc_isalnum(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswalnum_l((wint_t) c, pg_regex_locale);
+				return iswalnum_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isalnum_l((unsigned char) c, pg_regex_locale));
+					isalnum_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isalnum(c);
 #endif
 			break;
 	}
@@ -402,13 +423,18 @@ pg_wc_isupper(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswupper_l((wint_t) c, pg_regex_locale);
+				return iswupper_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isupper_l((unsigned char) c, pg_regex_locale));
+					isupper_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isupper(c);
 #endif
 			break;
 	}
@@ -435,13 +461,18 @@ pg_wc_islower(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswlower_l((wint_t) c, pg_regex_locale);
+				return iswlower_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					islower_l((unsigned char) c, pg_regex_locale));
+					islower_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_islower(c);
 #endif
 			break;
 	}
@@ -468,13 +499,18 @@ pg_wc_isgraph(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswgraph_l((wint_t) c, pg_regex_locale);
+				return iswgraph_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isgraph_l((unsigned char) c, pg_regex_locale));
+					isgraph_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isgraph(c);
 #endif
 			break;
 	}
@@ -501,13 +537,18 @@ pg_wc_isprint(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswprint_l((wint_t) c, pg_regex_locale);
+				return iswprint_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isprint_l((unsigned char) c, pg_regex_locale));
+					isprint_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isprint(c);
 #endif
 			break;
 	}
@@ -534,13 +575,18 @@ pg_wc_ispunct(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswpunct_l((wint_t) c, pg_regex_locale);
+				return iswpunct_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					ispunct_l((unsigned char) c, pg_regex_locale));
+					ispunct_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_ispunct(c);
 #endif
 			break;
 	}
@@ -567,13 +613,18 @@ pg_wc_isspace(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswspace_l((wint_t) c, pg_regex_locale);
+				return iswspace_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isspace_l((unsigned char) c, pg_regex_locale));
+					isspace_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isspace(c);
 #endif
 			break;
 	}
@@ -608,15 +659,20 @@ pg_wc_toupper(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return towupper_l((wint_t) c, pg_regex_locale);
+				return towupper_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			if (c <= (pg_wchar) UCHAR_MAX)
-				return toupper_l((unsigned char) c, pg_regex_locale);
+				return toupper_l((unsigned char) c, pg_regex_locale->info.lt);
 #endif
 			return c;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_toupper(c);
+#endif
+			break;
 	}
 	return 0;					/* can't get here, but keep compiler quiet */
 }
@@ -649,15 +705,20 @@ pg_wc_tolower(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return towlower_l((wint_t) c, pg_regex_locale);
+				return towlower_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			if (c <= (pg_wchar) UCHAR_MAX)
-				return tolower_l((unsigned char) c, pg_regex_locale);
+				return tolower_l((unsigned char) c, pg_regex_locale->info.lt);
 #endif
 			return c;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_tolower(c);
+#endif
+			break;
 	}
 	return 0;					/* can't get here, but keep compiler quiet */
 }
@@ -808,6 +869,9 @@ pg_ctype_get_cache(pg_wc_probefunc probefunc, int cclasscode)
 			max_chr = (pg_wchar) MAX_SIMPLE_CHR;
 #endif
 			break;
+		case PG_REGEX_LOCALE_ICU:
+			max_chr = (pg_wchar) MAX_SIMPLE_CHR;
+			break;
 		default:
 			max_chr = 0;		/* can't get here, but keep compiler quiet */
 			break;
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 20b5273405..c8d20fffea 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -1623,6 +1623,10 @@ ProcessUtilitySlow(ParseState *pstate,
 				commandCollected = true;
 				break;
 
+			case T_AlterCollationStmt:
+				address = AlterCollation((AlterCollationStmt *) parsetree);
+				break;
+
 			default:
 				elog(ERROR, "unrecognized node type: %d",
 					 (int) nodeTag(parsetree));
@@ -2673,6 +2677,10 @@ CreateCommandTag(Node *parsetree)
 			tag = "DROP SUBSCRIPTION";
 			break;
 
+		case T_AlterCollationStmt:
+			tag = "ALTER COLLATION";
+			break;
+
 		case T_PrepareStmt:
 			tag = "PREPARE";
 			break;
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index e552c8d20b..6fedcf3c9b 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -82,6 +82,10 @@
 #include <wctype.h>
 #endif
 
+#ifdef USE_ICU
+#include <unicode/ustring.h>
+#endif
+
 #include "catalog/pg_collation.h"
 #include "mb/pg_wchar.h"
 #include "utils/builtins.h"
@@ -1443,6 +1447,42 @@ str_numth(char *dest, char *num, int type)
  *			upper/lower/initcap functions
  *****************************************************************************/
 
+#ifdef USE_ICU
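+/*
+ * Apply an ICU case-mapping function (u_strToUpper and friends) to a UTF-16
+ * string.  A result buffer is palloc'd into *buff_dest; the return value is
+ * the length of the converted string.
+ */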
+static int32_t
+icu_convert_case(int32_t (*func)(UChar *, int32_t, const UChar *, int32_t, const char *, UErrorCode *),
+				 pg_locale_t mylocale, UChar **buff_dest, UChar *buff_source, int32_t len_source)
+{
+	UErrorCode	status;
+	int32_t		len_dest;
+
+	len_dest = len_source;  /* try first with same length */
+	*buff_dest = palloc(len_dest * sizeof(**buff_dest));
+	status = U_ZERO_ERROR;
+	len_dest = func(*buff_dest, len_dest, buff_source, len_source, mylocale->info.icu.locale, &status);
+	if (status == U_BUFFER_OVERFLOW_ERROR)
+	{
+		/* try again with adjusted length */
+		pfree(*buff_dest);
+		*buff_dest = palloc(len_dest * sizeof(**buff_dest));
+		status = U_ZERO_ERROR;
+		len_dest = func(*buff_dest, len_dest, buff_source, len_source, mylocale->info.icu.locale, &status);
+	}
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("case conversion failed: %s", u_errorName(status))));
+	return len_dest;
+}
+
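+/*
+ * Wrapper with the same signature as u_strToUpper()/u_strToLower(): it calls
+ * u_strToTitle() with a NULL break iterator, so the default word break
+ * iterator for the locale is used.  This lets icu_convert_case() handle
+ * title casing the same way as the other case conversions.
+ */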
+static int32_t
+u_strToTitle_default_BI(UChar *dest, int32_t destCapacity,
+						const UChar *src, int32_t srcLength,
+						const char *locale,
+						UErrorCode *pErrorCode)
+{
+	return u_strToTitle(dest, destCapacity, src, srcLength, NULL, locale, pErrorCode);
+}
+#endif
+
 /*
  * If the system provides the needed functions for wide-character manipulation
  * (which are all standardized by C99), then we implement upper/lower/initcap
@@ -1479,12 +1519,9 @@ str_tolower(const char *buff, size_t nbytes, Oid collid)
 		result = asc_tolower(buff, nbytes);
 	}
 #ifdef USE_WIDE_UPPER_LOWER
-	else if (pg_database_encoding_max_length() > 1)
+	else
 	{
 		pg_locale_t mylocale = 0;
-		wchar_t    *workspace;
-		size_t		curr_char;
-		size_t		result_size;
 
 		if (collid != DEFAULT_COLLATION_OID)
 		{
@@ -1502,77 +1539,79 @@ str_tolower(const char *buff, size_t nbytes, Oid collid)
 			mylocale = pg_newlocale_from_collation(collid);
 		}
 
-		/* Overflow paranoia */
-		if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
-			ereport(ERROR,
-					(errcode(ERRCODE_OUT_OF_MEMORY),
-					 errmsg("out of memory")));
+#ifdef USE_ICU
+		if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
+		{
+			int32_t		len_uchar;
+			int32_t		len_conv;
+			UChar	   *buff_uchar;
+			UChar	   *buff_conv;
+
+			len_uchar = icu_to_uchar(&buff_uchar, buff, nbytes);
+			len_conv = icu_convert_case(u_strToLower, mylocale, &buff_conv, buff_uchar, len_uchar);
+			icu_from_uchar(&result, buff_conv, len_conv);
+		}
+		else
+#endif
+		{
+			if (pg_database_encoding_max_length() > 1)
+			{
+				wchar_t    *workspace;
+				size_t		curr_char;
+				size_t		result_size;
 
-		/* Output workspace cannot have more codes than input bytes */
-		workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
+				/* Overflow paranoia */
+				if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
+					ereport(ERROR,
+							(errcode(ERRCODE_OUT_OF_MEMORY),
+							 errmsg("out of memory")));
 
-		char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
+				/* Output workspace cannot have more codes than input bytes */
+				workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
 
-		for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
-		{
+				char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
+
+				for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
+				{
 #ifdef HAVE_LOCALE_T
-			if (mylocale)
-				workspace[curr_char] = towlower_l(workspace[curr_char], mylocale);
-			else
+					if (mylocale)
+						workspace[curr_char] = towlower_l(workspace[curr_char], mylocale->info.lt);
+					else
 #endif
-				workspace[curr_char] = towlower(workspace[curr_char]);
-		}
+						workspace[curr_char] = towlower(workspace[curr_char]);
+				}
 
-		/* Make result large enough; case change might change number of bytes */
-		result_size = curr_char * pg_database_encoding_max_length() + 1;
-		result = palloc(result_size);
+				/* Make result large enough; case change might change number of bytes */
+				result_size = curr_char * pg_database_encoding_max_length() + 1;
+				result = palloc(result_size);
 
-		wchar2char(result, workspace, result_size, mylocale);
-		pfree(workspace);
-	}
+				wchar2char(result, workspace, result_size, mylocale);
+				pfree(workspace);
+			}
 #endif   /* USE_WIDE_UPPER_LOWER */
-	else
-	{
-#ifdef HAVE_LOCALE_T
-		pg_locale_t mylocale = 0;
-#endif
-		char	   *p;
-
-		if (collid != DEFAULT_COLLATION_OID)
-		{
-			if (!OidIsValid(collid))
+			else
 			{
-				/*
-				 * This typically means that the parser could not resolve a
-				 * conflict of implicit collations, so report it that way.
-				 */
-				ereport(ERROR,
-						(errcode(ERRCODE_INDETERMINATE_COLLATION),
-						 errmsg("could not determine which collation to use for lower() function"),
-						 errhint("Use the COLLATE clause to set the collation explicitly.")));
-			}
-#ifdef HAVE_LOCALE_T
-			mylocale = pg_newlocale_from_collation(collid);
-#endif
-		}
+				char	   *p;
 
-		result = pnstrdup(buff, nbytes);
+				result = pnstrdup(buff, nbytes);
 
-		/*
-		 * Note: we assume that tolower_l() will not be so broken as to need
-		 * an isupper_l() guard test.  When using the default collation, we
-		 * apply the traditional Postgres behavior that forces ASCII-style
-		 * treatment of I/i, but in non-default collations you get exactly
-		 * what the collation says.
-		 */
-		for (p = result; *p; p++)
-		{
+				/*
+				 * Note: we assume that tolower_l() will not be so broken as to need
+				 * an isupper_l() guard test.  When using the default collation, we
+				 * apply the traditional Postgres behavior that forces ASCII-style
+				 * treatment of I/i, but in non-default collations you get exactly
+				 * what the collation says.
+				 */
+				for (p = result; *p; p++)
+				{
 #ifdef HAVE_LOCALE_T
-			if (mylocale)
-				*p = tolower_l((unsigned char) *p, mylocale);
-			else
+					if (mylocale)
+						*p = tolower_l((unsigned char) *p, mylocale->info.lt);
+					else
 #endif
-				*p = pg_tolower((unsigned char) *p);
+						*p = pg_tolower((unsigned char) *p);
+				}
+			}
 		}
 	}
 
@@ -1599,12 +1638,9 @@ str_toupper(const char *buff, size_t nbytes, Oid collid)
 		result = asc_toupper(buff, nbytes);
 	}
 #ifdef USE_WIDE_UPPER_LOWER
-	else if (pg_database_encoding_max_length() > 1)
+	else
 	{
 		pg_locale_t mylocale = 0;
-		wchar_t    *workspace;
-		size_t		curr_char;
-		size_t		result_size;
 
 		if (collid != DEFAULT_COLLATION_OID)
 		{
@@ -1622,77 +1658,78 @@ str_toupper(const char *buff, size_t nbytes, Oid collid)
 			mylocale = pg_newlocale_from_collation(collid);
 		}
 
-		/* Overflow paranoia */
-		if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
-			ereport(ERROR,
-					(errcode(ERRCODE_OUT_OF_MEMORY),
-					 errmsg("out of memory")));
+#ifdef USE_ICU
+		if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
+		{
+			int32_t		len_uchar, len_conv;
+			UChar	   *buff_uchar;
+			UChar	   *buff_conv;
 
-		/* Output workspace cannot have more codes than input bytes */
-		workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
+			len_uchar = icu_to_uchar(&buff_uchar, buff, nbytes);
+			len_conv = icu_convert_case(u_strToUpper, mylocale, &buff_conv, buff_uchar, len_uchar);
+			icu_from_uchar(&result, buff_conv, len_conv);
+		}
+		else
+#endif
+		{
+			if (pg_database_encoding_max_length() > 1)
+			{
+				wchar_t    *workspace;
+				size_t		curr_char;
+				size_t		result_size;
 
-		char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
+				/* Overflow paranoia */
+				if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
+					ereport(ERROR,
+							(errcode(ERRCODE_OUT_OF_MEMORY),
+							 errmsg("out of memory")));
 
-		for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
-		{
-#ifdef HAVE_LOCALE_T
-			if (mylocale)
-				workspace[curr_char] = towupper_l(workspace[curr_char], mylocale);
-			else
-#endif
-				workspace[curr_char] = towupper(workspace[curr_char]);
-		}
+				/* Output workspace cannot have more codes than input bytes */
+				workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
 
-		/* Make result large enough; case change might change number of bytes */
-		result_size = curr_char * pg_database_encoding_max_length() + 1;
-		result = palloc(result_size);
+				char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
 
-		wchar2char(result, workspace, result_size, mylocale);
-		pfree(workspace);
-	}
-#endif   /* USE_WIDE_UPPER_LOWER */
-	else
-	{
+				for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
+				{
 #ifdef HAVE_LOCALE_T
-		pg_locale_t mylocale = 0;
+					if (mylocale)
+						workspace[curr_char] = towupper_l(workspace[curr_char], mylocale->info.lt);
+					else
 #endif
-		char	   *p;
+						workspace[curr_char] = towupper(workspace[curr_char]);
+				}
 
-		if (collid != DEFAULT_COLLATION_OID)
-		{
-			if (!OidIsValid(collid))
-			{
-				/*
-				 * This typically means that the parser could not resolve a
-				 * conflict of implicit collations, so report it that way.
-				 */
-				ereport(ERROR,
-						(errcode(ERRCODE_INDETERMINATE_COLLATION),
-						 errmsg("could not determine which collation to use for upper() function"),
-						 errhint("Use the COLLATE clause to set the collation explicitly.")));
+				/* Make result large enough; case change might change number of bytes */
+				result_size = curr_char * pg_database_encoding_max_length() + 1;
+				result = palloc(result_size);
+
+				wchar2char(result, workspace, result_size, mylocale);
+				pfree(workspace);
 			}
-#ifdef HAVE_LOCALE_T
-			mylocale = pg_newlocale_from_collation(collid);
-#endif
-		}
+#endif   /* USE_WIDE_UPPER_LOWER */
+			else
+			{
+				char	   *p;
 
-		result = pnstrdup(buff, nbytes);
+				result = pnstrdup(buff, nbytes);
 
-		/*
-		 * Note: we assume that toupper_l() will not be so broken as to need
-		 * an islower_l() guard test.  When using the default collation, we
-		 * apply the traditional Postgres behavior that forces ASCII-style
-		 * treatment of I/i, but in non-default collations you get exactly
-		 * what the collation says.
-		 */
-		for (p = result; *p; p++)
-		{
+				/*
+				 * Note: we assume that toupper_l() will not be so broken as to need
+				 * an islower_l() guard test.  When using the default collation, we
+				 * apply the traditional Postgres behavior that forces ASCII-style
+				 * treatment of I/i, but in non-default collations you get exactly
+				 * what the collation says.
+				 */
+				for (p = result; *p; p++)
+				{
 #ifdef HAVE_LOCALE_T
-			if (mylocale)
-				*p = toupper_l((unsigned char) *p, mylocale);
-			else
+					if (mylocale)
+						*p = toupper_l((unsigned char) *p, mylocale->info.lt);
+					else
 #endif
-				*p = pg_toupper((unsigned char) *p);
+						*p = pg_toupper((unsigned char) *p);
+				}
+			}
 		}
 	}
 
@@ -1720,12 +1757,9 @@ str_initcap(const char *buff, size_t nbytes, Oid collid)
 		result = asc_initcap(buff, nbytes);
 	}
 #ifdef USE_WIDE_UPPER_LOWER
-	else if (pg_database_encoding_max_length() > 1)
+	else
 	{
 		pg_locale_t mylocale = 0;
-		wchar_t    *workspace;
-		size_t		curr_char;
-		size_t		result_size;
 
 		if (collid != DEFAULT_COLLATION_OID)
 		{
@@ -1743,100 +1777,101 @@ str_initcap(const char *buff, size_t nbytes, Oid collid)
 			mylocale = pg_newlocale_from_collation(collid);
 		}
 
-		/* Overflow paranoia */
-		if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
-			ereport(ERROR,
-					(errcode(ERRCODE_OUT_OF_MEMORY),
-					 errmsg("out of memory")));
-
-		/* Output workspace cannot have more codes than input bytes */
-		workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
-
-		char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
-
-		for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
+#ifdef USE_ICU
+		if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
 		{
-#ifdef HAVE_LOCALE_T
-			if (mylocale)
-			{
-				if (wasalnum)
-					workspace[curr_char] = towlower_l(workspace[curr_char], mylocale);
-				else
-					workspace[curr_char] = towupper_l(workspace[curr_char], mylocale);
-				wasalnum = iswalnum_l(workspace[curr_char], mylocale);
-			}
-			else
+			int32_t		len_uchar, len_conv;
+			UChar	   *buff_uchar;
+			UChar	   *buff_conv;
+
+			len_uchar = icu_to_uchar(&buff_uchar, buff, nbytes);
+			len_conv = icu_convert_case(u_strToTitle_default_BI, mylocale, &buff_conv, buff_uchar, len_uchar);
+			icu_from_uchar(&result, buff_conv, len_conv);
+		}
+		else
 #endif
+		{
+			if (pg_database_encoding_max_length() > 1)
 			{
-				if (wasalnum)
-					workspace[curr_char] = towlower(workspace[curr_char]);
-				else
-					workspace[curr_char] = towupper(workspace[curr_char]);
-				wasalnum = iswalnum(workspace[curr_char]);
-			}
-		}
+				wchar_t    *workspace;
+				size_t		curr_char;
+				size_t		result_size;
 
-		/* Make result large enough; case change might change number of bytes */
-		result_size = curr_char * pg_database_encoding_max_length() + 1;
-		result = palloc(result_size);
+				/* Overflow paranoia */
+				if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
+					ereport(ERROR,
+							(errcode(ERRCODE_OUT_OF_MEMORY),
+							 errmsg("out of memory")));
 
-		wchar2char(result, workspace, result_size, mylocale);
-		pfree(workspace);
-	}
-#endif   /* USE_WIDE_UPPER_LOWER */
-	else
-	{
-#ifdef HAVE_LOCALE_T
-		pg_locale_t mylocale = 0;
-#endif
-		char	   *p;
+				/* Output workspace cannot have more codes than input bytes */
+				workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
 
-		if (collid != DEFAULT_COLLATION_OID)
-		{
-			if (!OidIsValid(collid))
-			{
-				/*
-				 * This typically means that the parser could not resolve a
-				 * conflict of implicit collations, so report it that way.
-				 */
-				ereport(ERROR,
-						(errcode(ERRCODE_INDETERMINATE_COLLATION),
-						 errmsg("could not determine which collation to use for initcap() function"),
-						 errhint("Use the COLLATE clause to set the collation explicitly.")));
-			}
+				char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
+
+				for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
+				{
 #ifdef HAVE_LOCALE_T
-			mylocale = pg_newlocale_from_collation(collid);
+					if (mylocale)
+					{
+						if (wasalnum)
+							workspace[curr_char] = towlower_l(workspace[curr_char], mylocale->info.lt);
+						else
+							workspace[curr_char] = towupper_l(workspace[curr_char], mylocale->info.lt);
+						wasalnum = iswalnum_l(workspace[curr_char], mylocale->info.lt);
+					}
+					else
 #endif
-		}
+					{
+						if (wasalnum)
+							workspace[curr_char] = towlower(workspace[curr_char]);
+						else
+							workspace[curr_char] = towupper(workspace[curr_char]);
+						wasalnum = iswalnum(workspace[curr_char]);
+					}
+				}
 
-		result = pnstrdup(buff, nbytes);
+				/* Make result large enough; case change might change number of bytes */
+				result_size = curr_char * pg_database_encoding_max_length() + 1;
+				result = palloc(result_size);
 
-		/*
-		 * Note: we assume that toupper_l()/tolower_l() will not be so broken
-		 * as to need guard tests.  When using the default collation, we apply
-		 * the traditional Postgres behavior that forces ASCII-style treatment
-		 * of I/i, but in non-default collations you get exactly what the
-		 * collation says.
-		 */
-		for (p = result; *p; p++)
-		{
-#ifdef HAVE_LOCALE_T
-			if (mylocale)
-			{
-				if (wasalnum)
-					*p = tolower_l((unsigned char) *p, mylocale);
-				else
-					*p = toupper_l((unsigned char) *p, mylocale);
-				wasalnum = isalnum_l((unsigned char) *p, mylocale);
+				wchar2char(result, workspace, result_size, mylocale);
+				pfree(workspace);
 			}
+#endif   /* USE_WIDE_UPPER_LOWER */
 			else
-#endif
 			{
-				if (wasalnum)
-					*p = pg_tolower((unsigned char) *p);
-				else
-					*p = pg_toupper((unsigned char) *p);
-				wasalnum = isalnum((unsigned char) *p);
+				char	   *p;
+
+				result = pnstrdup(buff, nbytes);
+
+				/*
+				 * Note: we assume that toupper_l()/tolower_l() will not be so broken
+				 * as to need guard tests.  When using the default collation, we apply
+				 * the traditional Postgres behavior that forces ASCII-style treatment
+				 * of I/i, but in non-default collations you get exactly what the
+				 * collation says.
+				 */
+				for (p = result; *p; p++)
+				{
+#ifdef HAVE_LOCALE_T
+					if (mylocale)
+					{
+						if (wasalnum)
+							*p = tolower_l((unsigned char) *p, mylocale->info.lt);
+						else
+							*p = toupper_l((unsigned char) *p, mylocale->info.lt);
+						wasalnum = isalnum_l((unsigned char) *p, mylocale->info.lt);
+					}
+					else
+#endif
+					{
+						if (wasalnum)
+							*p = pg_tolower((unsigned char) *p);
+						else
+							*p = pg_toupper((unsigned char) *p);
+						wasalnum = isalnum((unsigned char) *p);
+					}
+				}
 			}
 		}
 	}
diff --git a/src/backend/utils/adt/like.c b/src/backend/utils/adt/like.c
index 91fe109867..8a4f768de3 100644
--- a/src/backend/utils/adt/like.c
+++ b/src/backend/utils/adt/like.c
@@ -96,7 +96,7 @@ SB_lower_char(unsigned char c, pg_locale_t locale, bool locale_is_c)
 		return pg_ascii_tolower(c);
 #ifdef HAVE_LOCALE_T
 	else if (locale)
-		return tolower_l(c, locale);
+		return tolower_l(c, locale->info.lt);
 #endif
 	else
 		return pg_tolower(c);
@@ -165,14 +165,36 @@ Generic_Text_IC_like(text *str, text *pat, Oid collation)
 			   *p;
 	int			slen,
 				plen;
+	pg_locale_t locale = 0;
+	bool		locale_is_c = false;
+
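+	/*
+	 * Prepare the locale information needed for the provider check below and
+	 * for SB_lower_char().  This should match the method used in
+	 * str_tolower().
+	 */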
+	if (lc_ctype_is_c(collation))
+		locale_is_c = true;
+	else if (collation != DEFAULT_COLLATION_OID)
+	{
+		if (!OidIsValid(collation))
+		{
+			/*
+			 * This typically means that the parser could not resolve a
+			 * conflict of implicit collations, so report it that way.
+			 */
+			ereport(ERROR,
+					(errcode(ERRCODE_INDETERMINATE_COLLATION),
+					 errmsg("could not determine which collation to use for ILIKE"),
+					 errhint("Use the COLLATE clause to set the collation explicitly.")));
+		}
+		locale = pg_newlocale_from_collation(collation);
+	}
 
 	/*
 	 * For efficiency reasons, in the single byte case we don't call lower()
 	 * on the pattern and text, but instead call SB_lower_char on each
-	 * character.  In the multi-byte case we don't have much choice :-(
+	 * character.  In the multi-byte case we don't have much choice :-(.
+	 * Also, ICU does not support single-character case folding, so we go the
+	 * long way.
 	 */
 
-	if (pg_database_encoding_max_length() > 1)
+	if (pg_database_encoding_max_length() > 1 ||
+		(locale && locale->provider == COLLPROVIDER_ICU))
 	{
 		/* lower's result is never packed, so OK to use old macros here */
 		pat = DatumGetTextP(DirectFunctionCall1Coll(lower, collation,
@@ -190,31 +212,6 @@ Generic_Text_IC_like(text *str, text *pat, Oid collation)
 	}
 	else
 	{
-		/*
-		 * Here we need to prepare locale information for SB_lower_char. This
-		 * should match the methods used in str_tolower().
-		 */
-		pg_locale_t locale = 0;
-		bool		locale_is_c = false;
-
-		if (lc_ctype_is_c(collation))
-			locale_is_c = true;
-		else if (collation != DEFAULT_COLLATION_OID)
-		{
-			if (!OidIsValid(collation))
-			{
-				/*
-				 * This typically means that the parser could not resolve a
-				 * conflict of implicit collations, so report it that way.
-				 */
-				ereport(ERROR,
-						(errcode(ERRCODE_INDETERMINATE_COLLATION),
-						 errmsg("could not determine which collation to use for ILIKE"),
-						 errhint("Use the COLLATE clause to set the collation explicitly.")));
-			}
-			locale = pg_newlocale_from_collation(collation);
-		}
-
 		p = VARDATA_ANY(pat);
 		plen = VARSIZE_ANY_EXHDR(pat);
 		s = VARDATA_ANY(str);
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index ab197025f8..4a10655c73 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -57,11 +57,17 @@
 #include "catalog/pg_collation.h"
 #include "catalog/pg_control.h"
 #include "mb/pg_wchar.h"
+#include "utils/builtins.h"
 #include "utils/hsearch.h"
+#include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_locale.h"
 #include "utils/syscache.h"
 
+#ifdef USE_ICU
+#include <unicode/ucnv.h>
+#endif
+
 #ifdef WIN32
 /*
  * This Windows file defines StrNCpy. We don't need it here, so we undefine
@@ -1272,12 +1278,13 @@ pg_newlocale_from_collation(Oid collid)
 	if (cache_entry->locale == 0)
 	{
 		/* We haven't computed this yet in this session, so do it */
-#ifdef HAVE_LOCALE_T
 		HeapTuple	tp;
 		Form_pg_collation collform;
 		const char *collcollate;
-		const char *collctype;
-		locale_t	result;
+		const char *collctype pg_attribute_unused();
+		pg_locale_t	result;
+		Datum		collversion;
+		bool		isnull;
 
 		tp = SearchSysCache1(COLLOID, ObjectIdGetDatum(collid));
 		if (!HeapTupleIsValid(tp))
@@ -1287,61 +1294,223 @@ pg_newlocale_from_collation(Oid collid)
 		collcollate = NameStr(collform->collcollate);
 		collctype = NameStr(collform->collctype);
 
-		if (strcmp(collcollate, collctype) == 0)
+		result = malloc(sizeof(*result));
+		if (!result)
+			ereport(ERROR,
+					(errcode(ERRCODE_OUT_OF_MEMORY),
+					 errmsg("out of memory")));
+		memset(result, 0, sizeof(*result));
+		result->provider = collform->collprovider;
+
+		if (collform->collprovider == COLLPROVIDER_LIBC)
 		{
-			/* Normal case where they're the same */
+#ifdef HAVE_LOCALE_T
+			locale_t	loc;
+
+			if (strcmp(collcollate, collctype) == 0)
+			{
+				/* Normal case where they're the same */
 #ifndef WIN32
-			result = newlocale(LC_COLLATE_MASK | LC_CTYPE_MASK, collcollate,
-							   NULL);
+				loc = newlocale(LC_COLLATE_MASK | LC_CTYPE_MASK, collcollate,
+								   NULL);
 #else
-			result = _create_locale(LC_ALL, collcollate);
+				loc = _create_locale(LC_ALL, collcollate);
 #endif
-			if (!result)
-				report_newlocale_failure(collcollate);
-		}
-		else
-		{
+				if (!loc)
+					report_newlocale_failure(collcollate);
+			}
+			else
+			{
 #ifndef WIN32
-			/* We need two newlocale() steps */
-			locale_t	loc1;
-
-			loc1 = newlocale(LC_COLLATE_MASK, collcollate, NULL);
-			if (!loc1)
-				report_newlocale_failure(collcollate);
-			result = newlocale(LC_CTYPE_MASK, collctype, loc1);
-			if (!result)
-				report_newlocale_failure(collctype);
+				/* We need two newlocale() steps */
+				locale_t	loc1;
+
+				loc1 = newlocale(LC_COLLATE_MASK, collcollate, NULL);
+				if (!loc1)
+					report_newlocale_failure(collcollate);
+				loc = newlocale(LC_CTYPE_MASK, collctype, loc1);
+				if (!loc)
+					report_newlocale_failure(collctype);
 #else
 
-			/*
-			 * XXX The _create_locale() API doesn't appear to support this.
-			 * Could perhaps be worked around by changing pg_locale_t to
-			 * contain two separate fields.
-			 */
+				/*
+				 * XXX The _create_locale() API doesn't appear to support this.
+				 * Could perhaps be worked around by changing pg_locale_t to
+				 * contain two separate fields.
+				 */
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("collations with different collate and ctype values are not supported on this platform")));
+#endif
+			}
+
+			result->info.lt = loc;
+#else							/* not HAVE_LOCALE_T */
+			/* platform that doesn't support locale_t */
 			ereport(ERROR,
 					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-					 errmsg("collations with different collate and ctype values are not supported on this platform")));
-#endif
+					 errmsg("collation provider LIBC is not supported on this platform")));
+#endif   /* not HAVE_LOCALE_T */
+		}
+		else if (collform->collprovider == COLLPROVIDER_ICU)
+		{
+#ifdef USE_ICU
+			UCollator  *collator;
+			UErrorCode	status;
+
+			status = U_ZERO_ERROR;
+			collator = ucol_open(collcollate, &status);
+			if (U_FAILURE(status))
+				ereport(ERROR,
+						(errmsg("could not open collator for locale \"%s\": %s",
+								collcollate, u_errorName(status))));
+
+			result->info.icu.locale = strdup(collcollate);
+			result->info.icu.ucol = collator;
+#else /* not USE_ICU */
+			/* could get here if a collation was created by a build with ICU */
+			ereport(ERROR,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+					 errmsg("ICU is not supported in this build"),
+					 errhint("You need to rebuild PostgreSQL using --with-icu.")));
+#endif /* not USE_ICU */
 		}
 
-		cache_entry->locale = result;
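+		/*
+		 * Check that the collation version recorded in pg_collation still
+		 * matches what the provider reports now; if not, warn that objects
+		 * depending on this collation may need to be rebuilt.
+		 */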
+		collversion = SysCacheGetAttr(COLLOID, tp, Anum_pg_collation_collversion,
+									  &isnull);
+		if (!isnull)
+		{
+			char	   *sysversionstr;
+			char	   *collversionstr;
+
+			sysversionstr = get_system_collation_version(collform->collprovider, collcollate);
+			collversionstr = TextDatumGetCString(collversion);
+
+			if (strcmp(sysversionstr, collversionstr) != 0)
+				ereport(WARNING,
+						(errmsg("collation \"%s\" has version mismatch",
+								NameStr(collform->collname)),
+						 errdetail("The collation in the database was created using version %s, "
+								   "but the operating system provides version %s.",
+								   collversionstr, sysversionstr),
+						 errhint("Rebuild all objects affected by this collation and run "
+								 "ALTER COLLATION %s REFRESH VERSION, "
+								 "or build PostgreSQL with the right library version.",
+								 quote_qualified_identifier(get_namespace_name(collform->collnamespace),
+															NameStr(collform->collname)))));
+		}
 
 		ReleaseSysCache(tp);
-#else							/* not HAVE_LOCALE_T */
 
-		/*
-		 * For platforms that don't support locale_t, we can't do anything
-		 * with non-default collations.
-		 */
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-		errmsg("nondefault collations are not supported on this platform")));
-#endif   /* not HAVE_LOCALE_T */
+		cache_entry->locale = result;
 	}
 
 	return cache_entry->locale;
 }
 
+/*
+ * Get provider-specific collation version string for the given collation from
+ * the operating system/library.
+ *
+ * A particular provider must either always return a non-NULL string or always
+ * return NULL (if it doesn't support versions).  It must not return NULL for
+ * some collcollate values and non-NULL for others.
+ */
+char *
+get_system_collation_version(char collprovider, const char *collcollate)
+{
+	char	   *collversion;
+
+#ifdef USE_ICU
+	if (collprovider == COLLPROVIDER_ICU)
+	{
+		UCollator  *collator;
+		UErrorCode	status;
+		UVersionInfo versioninfo;
+		char		buf[U_MAX_VERSION_STRING_LENGTH];
+
+		status = U_ZERO_ERROR;
+		collator = ucol_open(collcollate, &status);
+		if (U_FAILURE(status))
+			ereport(ERROR,
+					(errmsg("could not open collator for locale \"%s\": %s",
+							collcollate, u_errorName(status))));
+		ucol_getVersion(collator, versioninfo);
+		ucol_close(collator);
+
+		u_versionToString(versioninfo, buf);
+		collversion = pstrdup(buf);
+	}
+	else
+#endif
+		collversion = NULL;
+
+	return collversion;
+}
+
+
+#ifdef USE_ICU
+/*
+ * Converter object for converting between ICU's UChar strings and C strings
+ * in database encoding.  Since the database encoding doesn't change, we only
+ * need one of these per session.
+ */
+static UConverter *icu_converter = NULL;
+
+static void
+init_icu_converter(void)
+{
+	const char *icu_encoding_name;
+	UErrorCode	status;
+	UConverter *conv;
+
+	if (icu_converter)
+		return;
+
+	icu_encoding_name = get_encoding_name_for_icu(GetDatabaseEncoding());
+
+	status = U_ZERO_ERROR;
+	conv = ucnv_open(icu_encoding_name, &status);
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("could not open ICU converter for encoding \"%s\": %s",
+						icu_encoding_name, u_errorName(status))));
+
+	icu_converter = conv;
+}
+
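+/*
+ * Convert a string in the database encoding into a string of UChars.
+ *
+ * The output string is palloc'd; its length in UChars is returned.
+ */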
+int32_t
+icu_to_uchar(UChar **buff_uchar, const char *buff, size_t nbytes)
+{
+	UErrorCode	status;
+	int32_t		len_uchar;
+
+	init_icu_converter();
+
+	len_uchar = 2 * nbytes;  /* max length per docs */
+	*buff_uchar = palloc(len_uchar * sizeof(**buff_uchar));
+	status = U_ZERO_ERROR;
+	len_uchar = ucnv_toUChars(icu_converter, *buff_uchar, len_uchar, buff, nbytes, &status);
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("ucnv_toUChars failed: %s", u_errorName(status))));
+	return len_uchar;
+}
+
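+/*
+ * Convert a string of UChars back into the database encoding.
+ *
+ * The output string is palloc'd and NUL-terminated; its length in bytes
+ * (not counting the terminator) is returned.
+ */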
+int32_t
+icu_from_uchar(char **result, UChar *buff_uchar, int32_t len_uchar)
+{
+	UErrorCode	status;
+	int32_t		len_result;
+
+	init_icu_converter();
+
+	len_result = UCNV_GET_MAX_BYTES_FOR_STRING(len_uchar, ucnv_getMaxCharSize(icu_converter));
+	*result = palloc(len_result + 1);
+	status = U_ZERO_ERROR;
+	len_result = ucnv_fromUChars(icu_converter, *result, len_result + 1,
+								 buff_uchar, len_uchar, &status);
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("ucnv_fromUChars failed: %s", u_errorName(status))));
+	return len_result;
+}
+#endif
 
 /*
  * These functions convert from/to libc's wchar_t, *not* pg_wchar_t.
@@ -1362,6 +1531,8 @@ wchar2char(char *to, const wchar_t *from, size_t tolen, pg_locale_t locale)
 {
 	size_t		result;
 
+	Assert(!locale || locale->provider == COLLPROVIDER_LIBC);
+
 	if (tolen == 0)
 		return 0;
 
@@ -1398,10 +1569,10 @@ wchar2char(char *to, const wchar_t *from, size_t tolen, pg_locale_t locale)
 #ifdef HAVE_LOCALE_T
 #ifdef HAVE_WCSTOMBS_L
 		/* Use wcstombs_l for nondefault locales */
-		result = wcstombs_l(to, from, tolen, locale);
+		result = wcstombs_l(to, from, tolen, locale->info.lt);
 #else							/* !HAVE_WCSTOMBS_L */
 		/* We have to temporarily set the locale as current ... ugh */
-		locale_t	save_locale = uselocale(locale);
+		locale_t	save_locale = uselocale(locale->info.lt);
 
 		result = wcstombs(to, from, tolen);
 
@@ -1432,6 +1603,8 @@ char2wchar(wchar_t *to, size_t tolen, const char *from, size_t fromlen,
 {
 	size_t		result;
 
+	Assert(!locale || locale->provider == COLLPROVIDER_LIBC);
+
 	if (tolen == 0)
 		return 0;
 
@@ -1473,10 +1646,10 @@ char2wchar(wchar_t *to, size_t tolen, const char *from, size_t fromlen,
 #ifdef HAVE_LOCALE_T
 #ifdef HAVE_MBSTOWCS_L
 			/* Use mbstowcs_l for nondefault locales */
-			result = mbstowcs_l(to, str, tolen, locale);
+			result = mbstowcs_l(to, str, tolen, locale->info.lt);
 #else							/* !HAVE_MBSTOWCS_L */
 			/* We have to temporarily set the locale as current ... ugh */
-			locale_t	save_locale = uselocale(locale);
+			locale_t	save_locale = uselocale(locale->info.lt);
 
 			result = mbstowcs(to, str, tolen);
 
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index a04fd7bc90..b70dadca5f 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -5258,7 +5258,7 @@ find_join_input_rel(PlannerInfo *root, Relids relids)
 /*
  * Check whether char is a letter (and, hence, subject to case-folding)
  *
- * In multibyte character sets, we can't use isalpha, and it does not seem
+ * In multibyte character sets or with ICU, we can't use isalpha, and it does not seem
  * worth trying to convert to wchar_t to use iswalpha.  Instead, just assume
  * any multibyte char is potentially case-varying.
  */
@@ -5270,9 +5270,11 @@ pattern_char_isalpha(char c, bool is_multibyte,
 		return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
 	else if (is_multibyte && IS_HIGHBIT_SET(c))
 		return true;
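+	/* with ICU, assume any non-ASCII character, plus ASCII letters, can fold */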
+	else if (locale && locale->provider == COLLPROVIDER_ICU)
+		return IS_HIGHBIT_SET(c) ||
+			(c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
 #ifdef HAVE_LOCALE_T
-	else if (locale)
-		return isalpha_l((unsigned char) c, locale);
+	else if (locale && locale->provider == COLLPROVIDER_LIBC)
+		return isalpha_l((unsigned char) c, locale->info.lt);
 #endif
 	else
 		return isalpha((unsigned char) c);
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index 28b5745ba8..9f8c2e3485 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -1403,10 +1403,7 @@ varstr_cmp(char *arg1, int len1, char *arg2, int len2, Oid collid)
 		char		a2buf[TEXTBUFLEN];
 		char	   *a1p,
 				   *a2p;
-
-#ifdef HAVE_LOCALE_T
 		pg_locale_t mylocale = 0;
-#endif
 
 		if (collid != DEFAULT_COLLATION_OID)
 		{
@@ -1421,9 +1418,7 @@ varstr_cmp(char *arg1, int len1, char *arg2, int len2, Oid collid)
 						 errmsg("could not determine which collation to use for string comparison"),
 						 errhint("Use the COLLATE clause to set the collation explicitly.")));
 			}
-#ifdef HAVE_LOCALE_T
 			mylocale = pg_newlocale_from_collation(collid);
-#endif
 		}
 
 		/*
@@ -1542,9 +1537,44 @@ varstr_cmp(char *arg1, int len1, char *arg2, int len2, Oid collid)
 		memcpy(a2p, arg2, len2);
 		a2p[len2] = '\0';
 
-#ifdef HAVE_LOCALE_T
 		if (mylocale)
-			result = strcoll_l(a1p, a2p, mylocale);
+		{
+#ifdef USE_ICU
+			if (mylocale->provider == COLLPROVIDER_ICU)
+			{
+#ifdef HAVE_UCOL_STRCOLLUTF8
+				if (GetDatabaseEncoding() == PG_UTF8)
+				{
+					UErrorCode	status;
+
+					status = U_ZERO_ERROR;
+					result = ucol_strcollUTF8(mylocale->info.icu.ucol,
+											  arg1, len1,
+											  arg2, len2,
+											  &status);
+					if (U_FAILURE(status))
+						ereport(ERROR,
+								(errmsg("collation failed: %s", u_errorName(status))));
+				}
+				else
+#endif
+				{
+					int32_t ulen1, ulen2;
+					UChar *uchar1, *uchar2;
+
+					ulen1 = icu_to_uchar(&uchar1, arg1, len1);
+					ulen2 = icu_to_uchar(&uchar2, arg2, len2);
+
+					result = ucol_strcoll(mylocale->info.icu.ucol,
+										  uchar1, ulen1,
+										  uchar2, ulen2);
+				}
+			}
+			else
+#endif
+#ifdef HAVE_LOCALE_T
+				result = strcoll_l(a1p, a2p, mylocale->info.lt);
+		}
 		else
 #endif
 			result = strcoll(a1p, a2p);
@@ -1768,10 +1798,7 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
 	bool		abbreviate = ssup->abbreviate;
 	bool		collate_c = false;
 	VarStringSortSupport *sss;
-
-#ifdef HAVE_LOCALE_T
 	pg_locale_t locale = 0;
-#endif
 
 	/*
 	 * If possible, set ssup->comparator to a function which can be used to
@@ -1826,9 +1853,7 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
 						 errmsg("could not determine which collation to use for string comparison"),
 						 errhint("Use the COLLATE clause to set the collation explicitly.")));
 			}
-#ifdef HAVE_LOCALE_T
 			locale = pg_newlocale_from_collation(collid);
-#endif
 		}
 	}
 
@@ -1854,7 +1879,7 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
 	 * platforms.
 	 */
 #ifndef TRUST_STRXFRM
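+	/*
+	 * ICU abbreviated keys are built with ucol_getSortKey() or
+	 * ucol_nextSortKeyPart() rather than strxfrm(), so the concern above does
+	 * not apply to them; keep abbreviation enabled for ICU collations.
+	 */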
-	if (!collate_c)
+	if (!collate_c && !(locale && locale->provider == COLLPROVIDER_ICU))
 		abbreviate = false;
 #endif
 
@@ -2090,9 +2115,44 @@ varstrfastcmp_locale(Datum x, Datum y, SortSupport ssup)
 		goto done;
 	}
 
-#ifdef HAVE_LOCALE_T
 	if (sss->locale)
-		result = strcoll_l(sss->buf1, sss->buf2, sss->locale);
+	{
+#ifdef USE_ICU
+		if (sss->locale->provider == COLLPROVIDER_ICU)
+		{
+#ifdef HAVE_UCOL_STRCOLLUTF8
+			if (GetDatabaseEncoding() == PG_UTF8)
+			{
+				UErrorCode	status;
+
+				status = U_ZERO_ERROR;
+				result = ucol_strcollUTF8(sss->locale->info.icu.ucol,
+										  a1p, len1,
+										  a2p, len2,
+										  &status);
+				if (U_FAILURE(status))
+					ereport(ERROR,
+							(errmsg("collation failed: %s", u_errorName(status))));
+			}
+			else
+#endif
+			{
+				int32_t ulen1, ulen2;
+				UChar *uchar1, *uchar2;
+
+				ulen1 = icu_to_uchar(&uchar1, a1p, len1);
+				ulen2 = icu_to_uchar(&uchar2, a2p, len2);
+
+				result = ucol_strcoll(sss->locale->info.icu.ucol,
+									  uchar1, ulen1,
+									  uchar2, ulen2);
+			}
+		}
+		else
+#endif
+#ifdef HAVE_LOCALE_T
+			result = strcoll_l(sss->buf1, sss->buf2, sss->locale->info.lt);
+	}
 	else
 #endif
 		result = strcoll(sss->buf1, sss->buf2);
@@ -2200,9 +2260,14 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
 	else
 	{
 		Size		bsize;
+#ifdef USE_ICU
+		int32_t		ulen = -1;
+		UChar	   *uchar;
+#endif
 
 		/*
-		 * We're not using the C collation, so fall back on strxfrm.
+		 * We're not using the C collation, so fall back on strxfrm() or the
+		 * ICU equivalents.
 		 */
 
 		/* By convention, we use buffer 1 to store and NUL-terminate */
@@ -2222,17 +2287,66 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
 			goto done;
 		}
 
-		/* Just like strcoll(), strxfrm() expects a NUL-terminated string */
 		memcpy(sss->buf1, authoritative_data, len);
+		/*
+		 * Just like strcoll(), strxfrm() expects a NUL-terminated string.
+		 * This is not necessary for ICU, but it doesn't hurt.
+		 */
 		sss->buf1[len] = '\0';
 		sss->last_len1 = len;
 
+#ifdef USE_ICU
+		/* When using ICU and not UTF8, convert string to UChar. */
+		if (sss->locale && sss->locale->provider == COLLPROVIDER_ICU &&
+			GetDatabaseEncoding() != PG_UTF8)
+			ulen = icu_to_uchar(&uchar, sss->buf1, len);
+#endif
+
+		/*
+		 * Loop: call strxfrm() or ucol_getSortKey(); if the result buffer was
+		 * too small, enlarge it and try again.  Both functions leave the
+		 * buffer contents undefined when the result does not fit, so we must
+		 * retry until everything fits, even though we only need the first few
+		 * bytes in the end.  With ucol_nextSortKeyPart(), however, we ask
+		 * only for as many bytes as we actually need.
+		 */
 		for (;;)
 		{
+#ifdef USE_ICU
+			if (sss->locale && sss->locale->provider == COLLPROVIDER_ICU)
+			{
+				/*
+				 * When using UTF8, use the iteration interface so we only
+				 * need to produce as many bytes as we actually need.
+				 */
+				if (GetDatabaseEncoding() == PG_UTF8)
+				{
+					UCharIterator iter;
+					uint32_t	state[2];
+					UErrorCode	status;
+
+					uiter_setUTF8(&iter, sss->buf1, len);
+					state[0] = state[1] = 0;  /* initial state; we won't need it afterwards */
+					status = U_ZERO_ERROR;
+					bsize = ucol_nextSortKeyPart(sss->locale->info.icu.ucol,
+												 &iter,
+												 state,
+												 (uint8_t *) sss->buf2,
+												 Min(sizeof(Datum), sss->buflen2),
+												 &status);
+					if (U_FAILURE(status))
+						ereport(ERROR,
+								(errmsg("sort key generation failed: %s", u_errorName(status))));
+				}
+				else
+					bsize = ucol_getSortKey(sss->locale->info.icu.ucol,
+											uchar, ulen,
+											(uint8_t *) sss->buf2, sss->buflen2);
+			}
+			else
+#endif
 #ifdef HAVE_LOCALE_T
-			if (sss->locale)
+			if (sss->locale && sss->locale->provider == COLLPROVIDER_LIBC)
 				bsize = strxfrm_l(sss->buf2, sss->buf1,
-								  sss->buflen2, sss->locale);
+								  sss->buflen2, sss->locale->info.lt);
 			else
 #endif
 				bsize = strxfrm(sss->buf2, sss->buf1, sss->buflen2);
@@ -2242,8 +2356,7 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
 				break;
 
 			/*
-			 * The C standard states that the contents of the buffer is now
-			 * unspecified.  Grow buffer, and retry.
+			 * Grow buffer and retry.
 			 */
 			pfree(sss->buf2);
 			sss->buflen2 = Max(bsize + 1,
diff --git a/src/backend/utils/mb/encnames.c b/src/backend/utils/mb/encnames.c
index 11099b844f..444eec25b5 100644
--- a/src/backend/utils/mb/encnames.c
+++ b/src/backend/utils/mb/encnames.c
@@ -403,6 +403,82 @@ const pg_enc2gettext pg_enc2gettext_tbl[] =
 };
 
 
+#ifndef FRONTEND
+
+/*
+ * Table of encoding names for ICU
+ *
+ * Reference: <https://ssl.icu-project.org/icu-bin/convexp>
+ *
+ * NULL entries are not supported by ICU, or their mapping is unclear.
+ */
+static const char * const pg_enc2icu_tbl[] =
+{
+	NULL,					/* PG_SQL_ASCII */
+	"EUC-JP",				/* PG_EUC_JP */
+	"EUC-CN",				/* PG_EUC_CN */
+	"EUC-KR",				/* PG_EUC_KR */
+	"EUC-TW",				/* PG_EUC_TW */
+	NULL,					/* PG_EUC_JIS_2004 */
+	"UTF-8",				/* PG_UTF8 */
+	NULL,					/* PG_MULE_INTERNAL */
+	"ISO-8859-1",			/* PG_LATIN1 */
+	"ISO-8859-2",			/* PG_LATIN2 */
+	"ISO-8859-3",			/* PG_LATIN3 */
+	"ISO-8859-4",			/* PG_LATIN4 */
+	"ISO-8859-9",			/* PG_LATIN5 */
+	"ISO-8859-10",			/* PG_LATIN6 */
+	"ISO-8859-13",			/* PG_LATIN7 */
+	"ISO-8859-14",			/* PG_LATIN8 */
+	"ISO-8859-15",			/* PG_LATIN9 */
+	NULL,					/* PG_LATIN10 */
+	"CP1256",				/* PG_WIN1256 */
+	"CP1258",				/* PG_WIN1258 */
+	"CP866",				/* PG_WIN866 */
+	NULL,					/* PG_WIN874 */
+	"KOI8-R",				/* PG_KOI8R */
+	"CP1251",				/* PG_WIN1251 */
+	"CP1252",				/* PG_WIN1252 */
+	"ISO-8859-5",			/* PG_ISO_8859_5 */
+	"ISO-8859-6",			/* PG_ISO_8859_6 */
+	"ISO-8859-7",			/* PG_ISO_8859_7 */
+	"ISO-8859-8",			/* PG_ISO_8859_8 */
+	"CP1250",				/* PG_WIN1250 */
+	"CP1253",				/* PG_WIN1253 */
+	"CP1254",				/* PG_WIN1254 */
+	"CP1255",				/* PG_WIN1255 */
+	"CP1257",				/* PG_WIN1257 */
+	"KOI8-U",				/* PG_KOI8U */
+};
+
+bool
+is_encoding_supported_by_icu(int encoding)
+{
+	return (pg_enc2icu_tbl[encoding] != NULL);
+}
+
+const char *
+get_encoding_name_for_icu(int encoding)
+{
+	const char *icu_encoding_name;
+
+	StaticAssertStmt(lengthof(pg_enc2icu_tbl) == PG_ENCODING_BE_LAST + 1,
+					 "pg_enc2icu_tbl incomplete");
+
+	icu_encoding_name = pg_enc2icu_tbl[encoding];
+
+	if (!icu_encoding_name)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("encoding \"%s\" not supported by ICU",
+						pg_encoding_to_char(encoding))));
+
+	return icu_encoding_name;
+}
+
+#endif /* not FRONTEND */
+
+
 /* ----------
  * Encoding checks, for error returns -1 else encoding id
  * ----------
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index da40d7ab67..637908487b 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -61,6 +61,7 @@
 
 #include "catalog/catalog.h"
 #include "catalog/pg_authid.h"
+#include "catalog/pg_collation.h"
 #include "common/file_utils.h"
 #include "common/restricted_token.h"
 #include "common/username.h"
@@ -1620,7 +1621,7 @@ setup_collation(FILE *cmdfd)
 	PG_CMD_PUTS("SELECT pg_import_system_collations(if_not_exists => false, schema => 'pg_catalog');\n\n");
 
 	/* Add an SQL-standard name */
-	PG_CMD_PRINTF2("INSERT INTO pg_collation (collname, collnamespace, collowner, collencoding, collcollate, collctype) VALUES ('ucs_basic', 'pg_catalog'::regnamespace, %u, %d, 'C', 'C');\n\n", BOOTSTRAP_SUPERUSERID, PG_UTF8);
+	PG_CMD_PRINTF3("INSERT INTO pg_collation (collname, collnamespace, collowner, collprovider, collencoding, collcollate, collctype) VALUES ('ucs_basic', 'pg_catalog'::regnamespace, %u, '%c', %d, 'C', 'C');\n\n", BOOTSTRAP_SUPERUSERID, COLLPROVIDER_LIBC, PG_UTF8);
 }
 
 /*
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index c7876fedd2..032fb95a2a 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -12801,8 +12801,10 @@ dumpCollation(Archive *fout, CollInfo *collinfo)
 	PQExpBuffer delq;
 	PQExpBuffer labelq;
 	PGresult   *res;
+	int			i_collprovider;
 	int			i_collcollate;
 	int			i_collctype;
+	const char *collprovider;
 	const char *collcollate;
 	const char *collctype;
 
@@ -12819,18 +12821,30 @@ dumpCollation(Archive *fout, CollInfo *collinfo)
 	selectSourceSchema(fout, collinfo->dobj.namespace->dobj.name);
 
 	/* Get collation-specific details */
-	appendPQExpBuffer(query, "SELECT "
-					  "collcollate, "
-					  "collctype "
-					  "FROM pg_catalog.pg_collation c "
-					  "WHERE c.oid = '%u'::pg_catalog.oid",
-					  collinfo->dobj.catId.oid);
+	if (fout->remoteVersion >= 100000)
+		appendPQExpBuffer(query, "SELECT "
+						  "collprovider, "
+						  "collcollate, "
+						  "collctype "
+						  "FROM pg_catalog.pg_collation c "
+						  "WHERE c.oid = '%u'::pg_catalog.oid",
+						  collinfo->dobj.catId.oid);
+	else
+		appendPQExpBuffer(query, "SELECT "
+						  "'c'::char AS collprovider, "
+						  "collcollate, "
+						  "collctype "
+						  "FROM pg_catalog.pg_collation c "
+						  "WHERE c.oid = '%u'::pg_catalog.oid",
+						  collinfo->dobj.catId.oid);
 
 	res = ExecuteSqlQueryForSingleRow(fout, query->data);
 
+	i_collprovider = PQfnumber(res, "collprovider");
 	i_collcollate = PQfnumber(res, "collcollate");
 	i_collctype = PQfnumber(res, "collctype");
 
+	collprovider = PQgetvalue(res, 0, i_collprovider);
 	collcollate = PQgetvalue(res, 0, i_collcollate);
 	collctype = PQgetvalue(res, 0, i_collctype);
 
@@ -12842,11 +12856,32 @@ dumpCollation(Archive *fout, CollInfo *collinfo)
 	appendPQExpBuffer(delq, ".%s;\n",
 					  fmtId(collinfo->dobj.name));
 
-	appendPQExpBuffer(q, "CREATE COLLATION %s (lc_collate = ",
+	appendPQExpBuffer(q, "CREATE COLLATION %s (",
 					  fmtId(collinfo->dobj.name));
-	appendStringLiteralAH(q, collcollate, fout);
-	appendPQExpBufferStr(q, ", lc_ctype = ");
-	appendStringLiteralAH(q, collctype, fout);
+
+	appendPQExpBufferStr(q, "provider = ");
+	if (collprovider[0] == 'c')
+		appendStringLiteralAH(q, "libc", fout);
+	else if (collprovider[0] == 'i')
+		appendStringLiteralAH(q, "icu", fout);
+	else
+		exit_horribly(NULL,
+					  "unrecognized collation provider: %s\n",
+					  collprovider);
+
+	if (strcmp(collcollate, collctype) == 0)
+	{
+		appendPQExpBufferStr(q, ", locale = ");
+		appendStringLiteralAH(q, collcollate, fout);
+	}
+	else
+	{
+		appendPQExpBufferStr(q, ", lc_collate = ");
+		appendStringLiteralAH(q, collcollate, fout);
+		appendPQExpBufferStr(q, ", lc_ctype = ");
+		appendStringLiteralAH(q, collctype, fout);
+	}
+
 	appendPQExpBufferStr(q, ");\n");
 
 	appendPQExpBuffer(labelq, "COLLATION %s", fmtId(collinfo->dobj.name));
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index e2e4cbcc08..ed2eeea81b 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3701,7 +3701,7 @@ listCollations(const char *pattern, bool verbose, bool showSystem)
 	PQExpBufferData buf;
 	PGresult   *res;
 	printQueryOpt myopt = pset.popt;
-	static const bool translate_columns[] = {false, false, false, false, false};
+	static const bool translate_columns[] = {false, false, false, false, false, false};
 
 	if (pset.sversion < 90100)
 	{
@@ -3725,6 +3725,11 @@ listCollations(const char *pattern, bool verbose, bool showSystem)
 					  gettext_noop("Collate"),
 					  gettext_noop("Ctype"));
 
+	if (pset.sversion >= 100000)
+		appendPQExpBuffer(&buf,
+						  ",\n       CASE c.collprovider WHEN 'd' THEN 'default' WHEN 'c' THEN 'libc' WHEN 'i' THEN 'icu' END AS \"%s\"",
+						  gettext_noop("Provider"));
+
 	if (verbose)
 		appendPQExpBuffer(&buf,
 						  ",\n       pg_catalog.obj_description(c.oid, 'pg_collation') AS \"%s\"",
diff --git a/src/include/catalog/pg_collation.h b/src/include/catalog/pg_collation.h
index 30c87e004e..8edd8aa066 100644
--- a/src/include/catalog/pg_collation.h
+++ b/src/include/catalog/pg_collation.h
@@ -34,9 +34,13 @@ CATALOG(pg_collation,3456)
 	NameData	collname;		/* collation name */
 	Oid			collnamespace;	/* OID of namespace containing collation */
 	Oid			collowner;		/* owner of collation */
+	char		collprovider;	/* see constants below */
 	int32		collencoding;	/* encoding for this collation; -1 = "all" */
 	NameData	collcollate;	/* LC_COLLATE setting */
 	NameData	collctype;		/* LC_CTYPE setting */
+#ifdef CATALOG_VARLEN			/* variable-length fields start here */
+	text		collversion;	/* provider-dependent version of collation data */
+#endif
 } FormData_pg_collation;
 
 /* ----------------
@@ -50,27 +54,34 @@ typedef FormData_pg_collation *Form_pg_collation;
  *		compiler constants for pg_collation
  * ----------------
  */
-#define Natts_pg_collation				6
+#define Natts_pg_collation				8
 #define Anum_pg_collation_collname		1
 #define Anum_pg_collation_collnamespace 2
 #define Anum_pg_collation_collowner		3
-#define Anum_pg_collation_collencoding	4
-#define Anum_pg_collation_collcollate	5
-#define Anum_pg_collation_collctype		6
+#define Anum_pg_collation_collprovider	4
+#define Anum_pg_collation_collencoding	5
+#define Anum_pg_collation_collcollate	6
+#define Anum_pg_collation_collctype		7
+#define Anum_pg_collation_collversion	8
 
 /* ----------------
  *		initial contents of pg_collation
  * ----------------
  */
 
-DATA(insert OID = 100 ( default		PGNSP PGUID -1 "" "" ));
+DATA(insert OID = 100 ( default		PGNSP PGUID d -1 "" "" _null_ ));
 DESCR("database's default collation");
 #define DEFAULT_COLLATION_OID	100
-DATA(insert OID = 950 ( C			PGNSP PGUID -1 "C" "C" ));
+DATA(insert OID = 950 ( C			PGNSP PGUID c -1 "C" "C" _null_ ));
 DESCR("standard C collation");
 #define C_COLLATION_OID			950
-DATA(insert OID = 951 ( POSIX		PGNSP PGUID -1 "POSIX" "POSIX" ));
+DATA(insert OID = 951 ( POSIX		PGNSP PGUID c -1 "POSIX" "POSIX" _null_ ));
 DESCR("standard POSIX collation");
 #define POSIX_COLLATION_OID		951
 
+
+#define COLLPROVIDER_DEFAULT	'd'
+#define COLLPROVIDER_ICU		'i'
+#define COLLPROVIDER_LIBC		'c'
+
 #endif   /* PG_COLLATION_H */
diff --git a/src/include/catalog/pg_collation_fn.h b/src/include/catalog/pg_collation_fn.h
index 482ba7920e..2a50e54124 100644
--- a/src/include/catalog/pg_collation_fn.h
+++ b/src/include/catalog/pg_collation_fn.h
@@ -16,6 +16,7 @@
 
 extern Oid CollationCreate(const char *collname, Oid collnamespace,
 				Oid collowner,
+				char collprovider,
 				int32 collencoding,
 				const char *collcollate, const char *collctype,
 				bool if_not_exists);
diff --git a/src/include/commands/collationcmds.h b/src/include/commands/collationcmds.h
index 3b2fcb8271..df5623ccb6 100644
--- a/src/include/commands/collationcmds.h
+++ b/src/include/commands/collationcmds.h
@@ -20,5 +20,6 @@
 
 extern ObjectAddress DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_exists);
 extern void IsThereCollationInNamespace(const char *collname, Oid nspOid);
+extern ObjectAddress AlterCollation(AlterCollationStmt *stmt);
 
 #endif   /* COLLATIONCMDS_H */
diff --git a/src/include/mb/pg_wchar.h b/src/include/mb/pg_wchar.h
index ceb5695924..5e279b319a 100644
--- a/src/include/mb/pg_wchar.h
+++ b/src/include/mb/pg_wchar.h
@@ -333,6 +333,12 @@ typedef struct pg_enc2gettext
 extern const pg_enc2gettext pg_enc2gettext_tbl[];
 
 /*
+ * Encoding names for ICU
+ */
+extern bool is_encoding_supported_by_icu(int encoding);
+extern const char *get_encoding_name_for_icu(int encoding);
+
+/*
  * pg_wchar stuff
  */
 typedef int (*mb2wchar_with_len_converter) (const unsigned char *from,
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 2bc7a5df11..5d6d05385c 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -423,6 +423,7 @@ typedef enum NodeTag
 	T_CreateSubscriptionStmt,
 	T_AlterSubscriptionStmt,
 	T_DropSubscriptionStmt,
+	T_AlterCollationStmt,
 
 	/*
 	 * TAGS FOR PARSE TREE NODES (parsenodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index a44d2178e1..4a3bbdd822 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1733,6 +1733,17 @@ typedef struct AlterTableCmd	/* one subcommand of an ALTER TABLE */
 
 
 /* ----------------------
+ * Alter Collation
+ * ----------------------
+ */
+typedef struct AlterCollationStmt
+{
+	NodeTag		type;
+	List	   *collname;
+} AlterCollationStmt;
+
+
+/* ----------------------
  *	Alter Domain
  *
  * The fields are used in different ways by the different variants of
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 5bcd8a1160..0ca708bbdf 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -603,6 +603,9 @@
 /* Define to 1 if you have the external array `tzname'. */
 #undef HAVE_TZNAME
 
+/* Define to 1 if you have the `ucol_strcollUTF8' function. */
+#undef HAVE_UCOL_STRCOLLUTF8
+
 /* Define to 1 if you have the <ucred.h> header file. */
 #undef HAVE_UCRED_H
 
@@ -816,6 +819,9 @@
    (--enable-float8-byval) */
 #undef USE_FLOAT8_BYVAL
 
+/* Define to build with ICU support. (--with-icu) */
+#undef USE_ICU
+
 /* Define to 1 to build with LDAP support. (--with-ldap) */
 #undef USE_LDAP
 
diff --git a/src/include/utils/pg_locale.h b/src/include/utils/pg_locale.h
index cb509e2b6b..e5f18f15b2 100644
--- a/src/include/utils/pg_locale.h
+++ b/src/include/utils/pg_locale.h
@@ -15,6 +15,9 @@
 #if defined(LOCALE_T_IN_XLOCALE) || defined(WCSTOMBS_L_IN_XLOCALE)
 #include <xlocale.h>
 #endif
+#ifdef USE_ICU
+#include <unicode/ucol.h>
+#endif
 
 #include "utils/guc.h"
 
@@ -61,17 +64,36 @@ extern void cache_locale_time(void);
  * We define our own wrapper around locale_t so we can keep the same
  * function signatures for all builds, while not having to create a
  * fake version of the standard type locale_t in the global namespace.
- * The fake version of pg_locale_t can be checked for truth; that's
- * about all it will be needed for.
+ * pg_locale_t is occasionally checked for truth, so make it a pointer.
  */
+struct pg_locale_t
+{
+	char	provider;
+	union
+	{
 #ifdef HAVE_LOCALE_T
-typedef locale_t pg_locale_t;
-#else
-typedef int pg_locale_t;
+		locale_t lt;
+#endif
+#ifdef USE_ICU
+		struct {
+			const char *locale;
+			UCollator *ucol;
+		} icu;
 #endif
+	} info;
+};
+
+typedef struct pg_locale_t *pg_locale_t;
 
 extern pg_locale_t pg_newlocale_from_collation(Oid collid);
 
+extern char *get_system_collation_version(char collprovider, const char *collcollate);
+
+#ifdef USE_ICU
+extern int32_t icu_to_uchar(UChar **buff_uchar, const char *buff, size_t nbytes);
+extern int32_t icu_from_uchar(char **result, UChar *buff_uchar, int32_t len_uchar);
+#endif
+
 /* These functions convert from/to libc's wchar_t, *not* pg_wchar_t */
 #ifdef USE_WIDE_UPPER_LOWER
 extern size_t wchar2char(char *to, const wchar_t *from, size_t tolen,
diff --git a/src/test/regress/GNUmakefile b/src/test/regress/GNUmakefile
index b923ea1420..a747facb9a 100644
--- a/src/test/regress/GNUmakefile
+++ b/src/test/regress/GNUmakefile
@@ -125,6 +125,9 @@ tablespace-setup:
 ##
 
 REGRESS_OPTS = --dlpath=. $(EXTRA_REGRESS_OPTS)
+ifeq ($(with_icu),yes)
+override EXTRA_TESTS := collate.icu $(EXTRA_TESTS)
+endif
 
 check: all tablespace-setup
 	$(pg_regress_check) $(REGRESS_OPTS) --schedule=$(srcdir)/parallel_schedule $(MAXCONNOPT) $(EXTRA_TESTS)
diff --git a/src/test/regress/expected/collate.linux.utf8.out b/src/test/regress/expected/collate.icu.out
similarity index 80%
copy from src/test/regress/expected/collate.linux.utf8.out
copy to src/test/regress/expected/collate.icu.out
index 293e78641e..0943e706bb 100644
--- a/src/test/regress/expected/collate.linux.utf8.out
+++ b/src/test/regress/expected/collate.icu.out
@@ -1,54 +1,54 @@
 /*
- * This test is for Linux/glibc systems and assumes that a full set of
- * locales is installed.  It must be run in a database with UTF-8 encoding,
- * because other encodings don't support all the characters used.
+ * This test is for ICU collations.
  */
 SET client_encoding TO UTF8;
+CREATE SCHEMA collate_tests;
+SET search_path = collate_tests;
 CREATE TABLE collate_test1 (
     a int,
-    b text COLLATE "en_US" NOT NULL
+    b text COLLATE "en%icu" NOT NULL
 );
 \d collate_test1
-           Table "public.collate_test1"
+        Table "collate_tests.collate_test1"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  a      | integer |           |          | 
- b      | text    | en_US     | not null | 
+ b      | text    | en%icu    | not null | 
 
 CREATE TABLE collate_test_fail (
     a int,
-    b text COLLATE "ja_JP.eucjp"
+    b text COLLATE "ja_JP.eucjp%icu"
 );
-ERROR:  collation "ja_JP.eucjp" for encoding "UTF8" does not exist
-LINE 3:     b text COLLATE "ja_JP.eucjp"
+ERROR:  collation "ja_JP.eucjp%icu" for encoding "UTF8" does not exist
+LINE 3:     b text COLLATE "ja_JP.eucjp%icu"
                    ^
 CREATE TABLE collate_test_fail (
     a int,
-    b text COLLATE "foo"
+    b text COLLATE "foo%icu"
 );
-ERROR:  collation "foo" for encoding "UTF8" does not exist
-LINE 3:     b text COLLATE "foo"
+ERROR:  collation "foo%icu" for encoding "UTF8" does not exist
+LINE 3:     b text COLLATE "foo%icu"
                    ^
 CREATE TABLE collate_test_fail (
-    a int COLLATE "en_US",
+    a int COLLATE "en%icu",
     b text
 );
 ERROR:  collations are not supported by type integer
-LINE 2:     a int COLLATE "en_US",
+LINE 2:     a int COLLATE "en%icu",
                   ^
 CREATE TABLE collate_test_like (
     LIKE collate_test1
 );
 \d collate_test_like
-         Table "public.collate_test_like"
+      Table "collate_tests.collate_test_like"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  a      | integer |           |          | 
- b      | text    | en_US     | not null | 
+ b      | text    | en%icu    | not null | 
 
 CREATE TABLE collate_test2 (
     a int,
-    b text COLLATE "sv_SE"
+    b text COLLATE "sv%icu"
 );
 CREATE TABLE collate_test3 (
     a int,
@@ -106,12 +106,12 @@ SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "C";
  3 | bbc
 (2 rows)
 
-SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "en_US";
-ERROR:  collation mismatch between explicit collations "C" and "en_US"
+SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "en%icu";
+ERROR:  collation mismatch between explicit collations "C" and "en%icu"
 LINE 1: ...* FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "e...
                                                              ^
-CREATE DOMAIN testdomain_sv AS text COLLATE "sv_SE";
-CREATE DOMAIN testdomain_i AS int COLLATE "sv_SE"; -- fails
+CREATE DOMAIN testdomain_sv AS text COLLATE "sv%icu";
+CREATE DOMAIN testdomain_i AS int COLLATE "sv%icu"; -- fails
 ERROR:  collations are not supported by type integer
 CREATE TABLE collate_test4 (
     a int,
@@ -129,7 +129,7 @@ SELECT a, b FROM collate_test4 ORDER BY b;
 
 CREATE TABLE collate_test5 (
     a int,
-    b testdomain_sv COLLATE "en_US"
+    b testdomain_sv COLLATE "en%icu"
 );
 INSERT INTO collate_test5 SELECT * FROM collate_test1;
 SELECT a, b FROM collate_test5 ORDER BY b;
@@ -206,13 +206,13 @@ SELECT * FROM collate_test3 ORDER BY b;
 (4 rows)
 
 -- constant expression folding
-SELECT 'bbc' COLLATE "en_US" > 'äbc' COLLATE "en_US" AS "true";
+SELECT 'bbc' COLLATE "en%icu" > 'äbc' COLLATE "en%icu" AS "true";
  true 
 ------
  t
 (1 row)
 
-SELECT 'bbc' COLLATE "sv_SE" > 'äbc' COLLATE "sv_SE" AS "false";
+SELECT 'bbc' COLLATE "sv%icu" > 'äbc' COLLATE "sv%icu" AS "false";
  false 
 -------
  f
@@ -221,8 +221,8 @@ SELECT 'bbc' COLLATE "sv_SE" > 'äbc' COLLATE "sv_SE" AS "false";
 -- upper/lower
 CREATE TABLE collate_test10 (
     a int,
-    x text COLLATE "en_US",
-    y text COLLATE "tr_TR"
+    x text COLLATE "en%icu",
+    y text COLLATE "tr%icu"
 );
 INSERT INTO collate_test10 VALUES (1, 'hij', 'hij'), (2, 'HIJ', 'HIJ');
 SELECT a, lower(x), lower(y), upper(x), upper(y), initcap(x), initcap(y) FROM collate_test10;
@@ -290,25 +290,25 @@ SELECT * FROM collate_test1 WHERE b ILIKE '%bc%';
  4 | ABC
 (4 rows)
 
-SELECT 'Türkiye' COLLATE "en_US" ILIKE '%KI%' AS "true";
+SELECT 'Türkiye' COLLATE "en%icu" ILIKE '%KI%' AS "true";
  true 
 ------
  t
 (1 row)
 
-SELECT 'Türkiye' COLLATE "tr_TR" ILIKE '%KI%' AS "false";
+SELECT 'Türkiye' COLLATE "tr%icu" ILIKE '%KI%' AS "false";
  false 
 -------
  f
 (1 row)
 
-SELECT 'bıt' ILIKE 'BIT' COLLATE "en_US" AS "false";
+SELECT 'bıt' ILIKE 'BIT' COLLATE "en%icu" AS "false";
  false 
 -------
  f
 (1 row)
 
-SELECT 'bıt' ILIKE 'BIT' COLLATE "tr_TR" AS "true";
+SELECT 'bıt' ILIKE 'BIT' COLLATE "tr%icu" AS "true";
  true 
 ------
  t
@@ -364,30 +364,62 @@ SELECT * FROM collate_test1 WHERE b ~* 'bc';
  4 | ABC
 (4 rows)
 
-SELECT 'Türkiye' COLLATE "en_US" ~* 'KI' AS "true";
+CREATE TABLE collate_test6 (
+    a int,
+    b text COLLATE "en%icu"
+);
+INSERT INTO collate_test6 VALUES (1, 'abc'), (2, 'ABC'), (3, '123'), (4, 'ab1'),
+                                 (5, 'a1!'), (6, 'a c'), (7, '!.;'), (8, '   '),
+                                 (9, 'äbç'), (10, 'ÄBÇ');
+SELECT b,
+       b ~ '^[[:alpha:]]+$' AS is_alpha,
+       b ~ '^[[:upper:]]+$' AS is_upper,
+       b ~ '^[[:lower:]]+$' AS is_lower,
+       b ~ '^[[:digit:]]+$' AS is_digit,
+       b ~ '^[[:alnum:]]+$' AS is_alnum,
+       b ~ '^[[:graph:]]+$' AS is_graph,
+       b ~ '^[[:print:]]+$' AS is_print,
+       b ~ '^[[:punct:]]+$' AS is_punct,
+       b ~ '^[[:space:]]+$' AS is_space
+FROM collate_test6;
+  b  | is_alpha | is_upper | is_lower | is_digit | is_alnum | is_graph | is_print | is_punct | is_space 
+-----+----------+----------+----------+----------+----------+----------+----------+----------+----------
+ abc | t        | f        | t        | f        | t        | t        | t        | f        | f
+ ABC | t        | t        | f        | f        | t        | t        | t        | f        | f
+ 123 | f        | f        | f        | t        | t        | t        | t        | f        | f
+ ab1 | f        | f        | f        | f        | t        | t        | t        | f        | f
+ a1! | f        | f        | f        | f        | f        | t        | t        | f        | f
+ a c | f        | f        | f        | f        | f        | f        | t        | f        | f
+ !.; | f        | f        | f        | f        | f        | t        | t        | t        | f
+     | f        | f        | f        | f        | f        | f        | t        | f        | t
+ äbç | t        | f        | t        | f        | t        | t        | t        | f        | f
+ ÄBÇ | t        | t        | f        | f        | t        | t        | t        | f        | f
+(10 rows)
+
+SELECT 'Türkiye' COLLATE "en%icu" ~* 'KI' AS "true";
+ true 
+------
+ t
+(1 row)
+
+SELECT 'Türkiye' COLLATE "tr%icu" ~* 'KI' AS "true";  -- true with ICU
  true 
 ------
  t
 (1 row)
 
-SELECT 'Türkiye' COLLATE "tr_TR" ~* 'KI' AS "false";
+SELECT 'bıt' ~* 'BIT' COLLATE "en%icu" AS "false";
  false 
 -------
  f
 (1 row)
 
-SELECT 'bıt' ~* 'BIT' COLLATE "en_US" AS "false";
+SELECT 'bıt' ~* 'BIT' COLLATE "tr%icu" AS "false";  -- false with ICU
  false 
 -------
  f
 (1 row)
 
-SELECT 'bıt' ~* 'BIT' COLLATE "tr_TR" AS "true";
- true 
-------
- t
-(1 row)
-
 -- The following actually exercises the selectivity estimation for ~*.
 SELECT relname FROM pg_class WHERE relname ~* '^abc';
  relname 
@@ -402,7 +434,7 @@ SELECT to_char(date '2010-04-01', 'DD TMMON YYYY');
  01 NIS 2010
 (1 row)
 
-SELECT to_char(date '2010-04-01', 'DD TMMON YYYY' COLLATE "tr_TR");
+SELECT to_char(date '2010-04-01', 'DD TMMON YYYY' COLLATE "tr%icu");
    to_char   
 -------------
  01 NİS 2010
@@ -693,7 +725,7 @@ SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM collate_test3; -- ok
 (8 rows)
 
 SELECT a, b FROM collate_test1 UNION SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
-ERROR:  collation mismatch between implicit collations "en_US" and "C"
+ERROR:  collation mismatch between implicit collations "en%icu" and "C"
 LINE 1: SELECT a, b FROM collate_test1 UNION SELECT a, b FROM collat...
                                                        ^
 HINT:  You can choose the collation by applying the COLLATE clause to one or both expressions.
@@ -707,12 +739,12 @@ SELECT a, b COLLATE "C" FROM collate_test1 UNION SELECT a, b FROM collate_test3
 (4 rows)
 
 SELECT a, b FROM collate_test1 INTERSECT SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
-ERROR:  collation mismatch between implicit collations "en_US" and "C"
+ERROR:  collation mismatch between implicit collations "en%icu" and "C"
 LINE 1: ...ELECT a, b FROM collate_test1 INTERSECT SELECT a, b FROM col...
                                                              ^
 HINT:  You can choose the collation by applying the COLLATE clause to one or both expressions.
 SELECT a, b FROM collate_test1 EXCEPT SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
-ERROR:  collation mismatch between implicit collations "en_US" and "C"
+ERROR:  collation mismatch between implicit collations "en%icu" and "C"
 LINE 1: SELECT a, b FROM collate_test1 EXCEPT SELECT a, b FROM colla...
                                                         ^
 HINT:  You can choose the collation by applying the COLLATE clause to one or both expressions.
@@ -731,18 +763,18 @@ select x || y from collate_test10; -- ok, because || is not collation aware
 (2 rows)
 
 select x, y from collate_test10 order by x || y; -- not so ok
-ERROR:  collation mismatch between implicit collations "en_US" and "tr_TR"
+ERROR:  collation mismatch between implicit collations "en%icu" and "tr%icu"
 LINE 1: select x, y from collate_test10 order by x || y;
                                                       ^
 HINT:  You can choose the collation by applying the COLLATE clause to one or both expressions.
 -- collation mismatch between recursive and non-recursive term
 WITH RECURSIVE foo(x) AS
-   (SELECT x FROM (VALUES('a' COLLATE "en_US"),('b')) t(x)
+   (SELECT x FROM (VALUES('a' COLLATE "en%icu"),('b')) t(x)
    UNION ALL
-   SELECT (x || 'c') COLLATE "de_DE" FROM foo WHERE length(x) < 10)
+   SELECT (x || 'c') COLLATE "de%icu" FROM foo WHERE length(x) < 10)
 SELECT * FROM foo;
-ERROR:  recursive query "foo" column 1 has collation "en_US" in non-recursive term but collation "de_DE" overall
-LINE 2:    (SELECT x FROM (VALUES('a' COLLATE "en_US"),('b')) t(x)
+ERROR:  recursive query "foo" column 1 has collation "en%icu" in non-recursive term but collation "de%icu" overall
+LINE 2:    (SELECT x FROM (VALUES('a' COLLATE "en%icu"),('b')) t(x)
                    ^
 HINT:  Use the COLLATE clause to set the collation of the non-recursive term.
 -- casting
@@ -843,7 +875,7 @@ begin
   return xx < yy;
 end
 $$;
-SELECT mylt2('a', 'B' collate "en_US") as t, mylt2('a', 'B' collate "C") as f;
+SELECT mylt2('a', 'B' collate "en%icu") as t, mylt2('a', 'B' collate "C") as f;
  t | f 
 ---+---
  t | f
@@ -957,29 +989,23 @@ CREATE SCHEMA test_schema;
 -- We need to do this this way to cope with varying names for encodings:
 do $$
 BEGIN
-  EXECUTE 'CREATE COLLATION test0 (locale = ' ||
+  EXECUTE 'CREATE COLLATION test0 (provider = icu, locale = ' ||
           quote_literal(current_setting('lc_collate')) || ');';
 END
 $$;
 CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
-ERROR:  collation "test0" for encoding "UTF8" already exists
-CREATE COLLATION IF NOT EXISTS test0 FROM "C"; -- ok, skipped
-NOTICE:  collation "test0" for encoding "UTF8" already exists, skipping
-CREATE COLLATION IF NOT EXISTS test0 (locale = 'foo'); -- ok, skipped
-NOTICE:  collation "test0" for encoding "UTF8" already exists, skipping
+ERROR:  collation "test0" already exists
 do $$
 BEGIN
-  EXECUTE 'CREATE COLLATION test1 (lc_collate = ' ||
+  EXECUTE 'CREATE COLLATION test1 (provider = icu, lc_collate = ' ||
           quote_literal(current_setting('lc_collate')) ||
           ', lc_ctype = ' ||
           quote_literal(current_setting('lc_ctype')) || ');';
 END
 $$;
-CREATE COLLATION test3 (lc_collate = 'en_US.utf8'); -- fail, need lc_ctype
+CREATE COLLATION test3 (provider = icu, lc_collate = 'en_US.utf8'); -- fail, need lc_ctype
 ERROR:  parameter "lc_ctype" must be specified
-CREATE COLLATION testx (locale = 'nonsense'); -- fail
-ERROR:  could not create locale "nonsense": No such file or directory
-DETAIL:  The operating system could not find any locale data for the locale name "nonsense".
+CREATE COLLATION testx (provider = icu, locale = 'nonsense'); /* never fails with ICU */  DROP COLLATION testx;
 CREATE COLLATION test4 FROM nonsense;
 ERROR:  collation "nonsense" for encoding "UTF8" does not exist
 CREATE COLLATION test5 FROM test0;
@@ -993,7 +1019,7 @@ SELECT collname FROM pg_collation WHERE collname LIKE 'test%' ORDER BY 1;
 
 ALTER COLLATION test1 RENAME TO test11;
 ALTER COLLATION test0 RENAME TO test11; -- fail
-ERROR:  collation "test11" for encoding "UTF8" already exists in schema "public"
+ERROR:  collation "test11" already exists in schema "collate_tests"
 ALTER COLLATION test1 RENAME TO test22; -- fail
 ERROR:  collation "test1" for encoding "UTF8" does not exist
 ALTER COLLATION test11 OWNER TO regress_test_role;
@@ -1005,11 +1031,11 @@ SELECT collname, nspname, obj_description(pg_collation.oid, 'pg_collation')
     FROM pg_collation JOIN pg_namespace ON (collnamespace = pg_namespace.oid)
     WHERE collname LIKE 'test%'
     ORDER BY 1;
- collname |   nspname   | obj_description 
-----------+-------------+-----------------
- test0    | public      | US English
- test11   | test_schema | 
- test5    | public      | 
+ collname |    nspname    | obj_description 
+----------+---------------+-----------------
+ test0    | collate_tests | US English
+ test11   | test_schema   | 
+ test5    | collate_tests | 
 (3 rows)
 
 DROP COLLATION test0, test_schema.test11, test5;
@@ -1024,6 +1050,9 @@ SELECT collname FROM pg_collation WHERE collname LIKE 'test%';
 
 DROP SCHEMA test_schema;
 DROP ROLE regress_test_role;
+-- ALTER
+ALTER COLLATION "en%icu" REFRESH VERSION;
+NOTICE:  version has not changed
 -- dependencies
 CREATE COLLATION test0 FROM "C";
 CREATE TABLE collate_dep_test1 (a int, b text COLLATE test0);
@@ -1048,13 +1077,13 @@ drop cascades to composite type collate_dep_test2 column y
 drop cascades to view collate_dep_test3
 drop cascades to index collate_dep_test4i
 \d collate_dep_test1
-         Table "public.collate_dep_test1"
+      Table "collate_tests.collate_dep_test1"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  a      | integer |           |          | 
 
 \d collate_dep_test2
-     Composite type "public.collate_dep_test2"
+ Composite type "collate_tests.collate_dep_test2"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  x      | integer |           |          | 
@@ -1063,7 +1092,7 @@ DROP TABLE collate_dep_test1, collate_dep_test4t;
 DROP TYPE collate_dep_test2;
 -- test range types and collations
 create type textrange_c as range(subtype=text, collation="C");
-create type textrange_en_us as range(subtype=text, collation="en_US");
+create type textrange_en_us as range(subtype=text, collation="en%icu");
 select textrange_c('A','Z') @> 'b'::text;
  ?column? 
 ----------
@@ -1078,3 +1107,24 @@ select textrange_en_us('A','Z') @> 'b'::text;
 
 drop type textrange_c;
 drop type textrange_en_us;
+-- cleanup
+DROP SCHEMA collate_tests CASCADE;
+NOTICE:  drop cascades to 18 other objects
+DETAIL:  drop cascades to table collate_test1
+drop cascades to table collate_test_like
+drop cascades to table collate_test2
+drop cascades to table collate_test3
+drop cascades to type testdomain_sv
+drop cascades to table collate_test4
+drop cascades to table collate_test5
+drop cascades to table collate_test10
+drop cascades to table collate_test6
+drop cascades to view collview1
+drop cascades to view collview2
+drop cascades to view collview3
+drop cascades to type testdomain
+drop cascades to function mylt(text,text)
+drop cascades to function mylt_noninline(text,text)
+drop cascades to function mylt_plpgsql(text,text)
+drop cascades to function mylt2(text,text)
+drop cascades to function dup(anyelement)
diff --git a/src/test/regress/expected/collate.linux.utf8.out b/src/test/regress/expected/collate.linux.utf8.out
index 293e78641e..332fb4dbbd 100644
--- a/src/test/regress/expected/collate.linux.utf8.out
+++ b/src/test/regress/expected/collate.linux.utf8.out
@@ -4,12 +4,14 @@
  * because other encodings don't support all the characters used.
  */
 SET client_encoding TO UTF8;
+CREATE SCHEMA collate_tests;
+SET search_path = collate_tests;
 CREATE TABLE collate_test1 (
     a int,
     b text COLLATE "en_US" NOT NULL
 );
 \d collate_test1
-           Table "public.collate_test1"
+        Table "collate_tests.collate_test1"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  a      | integer |           |          | 
@@ -40,7 +42,7 @@ CREATE TABLE collate_test_like (
     LIKE collate_test1
 );
 \d collate_test_like
-         Table "public.collate_test_like"
+      Table "collate_tests.collate_test_like"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  a      | integer |           |          | 
@@ -364,6 +366,38 @@ SELECT * FROM collate_test1 WHERE b ~* 'bc';
  4 | ABC
 (4 rows)
 
+CREATE TABLE collate_test6 (
+    a int,
+    b text COLLATE "en_US"
+);
+INSERT INTO collate_test6 VALUES (1, 'abc'), (2, 'ABC'), (3, '123'), (4, 'ab1'),
+                                 (5, 'a1!'), (6, 'a c'), (7, '!.;'), (8, '   '),
+                                 (9, 'äbç'), (10, 'ÄBÇ');
+SELECT b,
+       b ~ '^[[:alpha:]]+$' AS is_alpha,
+       b ~ '^[[:upper:]]+$' AS is_upper,
+       b ~ '^[[:lower:]]+$' AS is_lower,
+       b ~ '^[[:digit:]]+$' AS is_digit,
+       b ~ '^[[:alnum:]]+$' AS is_alnum,
+       b ~ '^[[:graph:]]+$' AS is_graph,
+       b ~ '^[[:print:]]+$' AS is_print,
+       b ~ '^[[:punct:]]+$' AS is_punct,
+       b ~ '^[[:space:]]+$' AS is_space
+FROM collate_test6;
+  b  | is_alpha | is_upper | is_lower | is_digit | is_alnum | is_graph | is_print | is_punct | is_space 
+-----+----------+----------+----------+----------+----------+----------+----------+----------+----------
+ abc | t        | f        | t        | f        | t        | t        | t        | f        | f
+ ABC | t        | t        | f        | f        | t        | t        | t        | f        | f
+ 123 | f        | f        | f        | t        | t        | t        | t        | f        | f
+ ab1 | f        | f        | f        | f        | t        | t        | t        | f        | f
+ a1! | f        | f        | f        | f        | f        | t        | t        | f        | f
+ a c | f        | f        | f        | f        | f        | f        | t        | f        | f
+ !.; | f        | f        | f        | f        | f        | t        | t        | t        | f
+     | f        | f        | f        | f        | f        | f        | t        | f        | t
+ äbç | t        | f        | t        | f        | t        | t        | t        | f        | f
+ ÄBÇ | t        | t        | f        | f        | t        | t        | t        | f        | f
+(10 rows)
+
 SELECT 'Türkiye' COLLATE "en_US" ~* 'KI' AS "true";
  true 
 ------
@@ -993,7 +1027,7 @@ SELECT collname FROM pg_collation WHERE collname LIKE 'test%' ORDER BY 1;
 
 ALTER COLLATION test1 RENAME TO test11;
 ALTER COLLATION test0 RENAME TO test11; -- fail
-ERROR:  collation "test11" for encoding "UTF8" already exists in schema "public"
+ERROR:  collation "test11" for encoding "UTF8" already exists in schema "collate_tests"
 ALTER COLLATION test1 RENAME TO test22; -- fail
 ERROR:  collation "test1" for encoding "UTF8" does not exist
 ALTER COLLATION test11 OWNER TO regress_test_role;
@@ -1005,11 +1039,11 @@ SELECT collname, nspname, obj_description(pg_collation.oid, 'pg_collation')
     FROM pg_collation JOIN pg_namespace ON (collnamespace = pg_namespace.oid)
     WHERE collname LIKE 'test%'
     ORDER BY 1;
- collname |   nspname   | obj_description 
-----------+-------------+-----------------
- test0    | public      | US English
- test11   | test_schema | 
- test5    | public      | 
+ collname |    nspname    | obj_description 
+----------+---------------+-----------------
+ test0    | collate_tests | US English
+ test11   | test_schema   | 
+ test5    | collate_tests | 
 (3 rows)
 
 DROP COLLATION test0, test_schema.test11, test5;
@@ -1024,6 +1058,9 @@ SELECT collname FROM pg_collation WHERE collname LIKE 'test%';
 
 DROP SCHEMA test_schema;
 DROP ROLE regress_test_role;
+-- ALTER
+ALTER COLLATION "en_US" REFRESH VERSION;
+NOTICE:  version has not changed
 -- dependencies
 CREATE COLLATION test0 FROM "C";
 CREATE TABLE collate_dep_test1 (a int, b text COLLATE test0);
@@ -1048,13 +1085,13 @@ drop cascades to composite type collate_dep_test2 column y
 drop cascades to view collate_dep_test3
 drop cascades to index collate_dep_test4i
 \d collate_dep_test1
-         Table "public.collate_dep_test1"
+      Table "collate_tests.collate_dep_test1"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  a      | integer |           |          | 
 
 \d collate_dep_test2
-     Composite type "public.collate_dep_test2"
+ Composite type "collate_tests.collate_dep_test2"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  x      | integer |           |          | 
@@ -1078,3 +1115,24 @@ select textrange_en_us('A','Z') @> 'b'::text;
 
 drop type textrange_c;
 drop type textrange_en_us;
+-- cleanup
+DROP SCHEMA collate_tests CASCADE;
+NOTICE:  drop cascades to 18 other objects
+DETAIL:  drop cascades to table collate_test1
+drop cascades to table collate_test_like
+drop cascades to table collate_test2
+drop cascades to table collate_test3
+drop cascades to type testdomain_sv
+drop cascades to table collate_test4
+drop cascades to table collate_test5
+drop cascades to table collate_test10
+drop cascades to table collate_test6
+drop cascades to view collview1
+drop cascades to view collview2
+drop cascades to view collview3
+drop cascades to type testdomain
+drop cascades to function mylt(text,text)
+drop cascades to function mylt_noninline(text,text)
+drop cascades to function mylt_plpgsql(text,text)
+drop cascades to function mylt2(text,text)
+drop cascades to function dup(anyelement)
diff --git a/src/test/regress/sql/collate.linux.utf8.sql b/src/test/regress/sql/collate.icu.sql
similarity index 82%
copy from src/test/regress/sql/collate.linux.utf8.sql
copy to src/test/regress/sql/collate.icu.sql
index c349cbde2b..cf092730ec 100644
--- a/src/test/regress/sql/collate.linux.utf8.sql
+++ b/src/test/regress/sql/collate.icu.sql
@@ -1,31 +1,32 @@
 /*
- * This test is for Linux/glibc systems and assumes that a full set of
- * locales is installed.  It must be run in a database with UTF-8 encoding,
- * because other encodings don't support all the characters used.
+ * This test is for ICU collations.
  */
 
 SET client_encoding TO UTF8;
 
+CREATE SCHEMA collate_tests;
+SET search_path = collate_tests;
+
 
 CREATE TABLE collate_test1 (
     a int,
-    b text COLLATE "en_US" NOT NULL
+    b text COLLATE "en%icu" NOT NULL
 );
 
 \d collate_test1
 
 CREATE TABLE collate_test_fail (
     a int,
-    b text COLLATE "ja_JP.eucjp"
+    b text COLLATE "ja_JP.eucjp%icu"
 );
 
 CREATE TABLE collate_test_fail (
     a int,
-    b text COLLATE "foo"
+    b text COLLATE "foo%icu"
 );
 
 CREATE TABLE collate_test_fail (
-    a int COLLATE "en_US",
+    a int COLLATE "en%icu",
     b text
 );
 
@@ -37,7 +38,7 @@ CREATE TABLE collate_test_like (
 
 CREATE TABLE collate_test2 (
     a int,
-    b text COLLATE "sv_SE"
+    b text COLLATE "sv%icu"
 );
 
 CREATE TABLE collate_test3 (
@@ -57,11 +58,11 @@ CREATE TABLE collate_test3 (
 SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc';
 SELECT * FROM collate_test1 WHERE b >= 'bbc' COLLATE "C";
 SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "C";
-SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "en_US";
+SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "en%icu";
 
 
-CREATE DOMAIN testdomain_sv AS text COLLATE "sv_SE";
-CREATE DOMAIN testdomain_i AS int COLLATE "sv_SE"; -- fails
+CREATE DOMAIN testdomain_sv AS text COLLATE "sv%icu";
+CREATE DOMAIN testdomain_i AS int COLLATE "sv%icu"; -- fails
 CREATE TABLE collate_test4 (
     a int,
     b testdomain_sv
@@ -71,7 +72,7 @@ CREATE TABLE collate_test4 (
 
 CREATE TABLE collate_test5 (
     a int,
-    b testdomain_sv COLLATE "en_US"
+    b testdomain_sv COLLATE "en%icu"
 );
 INSERT INTO collate_test5 SELECT * FROM collate_test1;
 SELECT a, b FROM collate_test5 ORDER BY b;
@@ -89,15 +90,15 @@ CREATE TABLE collate_test5 (
 SELECT * FROM collate_test3 ORDER BY b;
 
 -- constant expression folding
-SELECT 'bbc' COLLATE "en_US" > 'äbc' COLLATE "en_US" AS "true";
-SELECT 'bbc' COLLATE "sv_SE" > 'äbc' COLLATE "sv_SE" AS "false";
+SELECT 'bbc' COLLATE "en%icu" > 'äbc' COLLATE "en%icu" AS "true";
+SELECT 'bbc' COLLATE "sv%icu" > 'äbc' COLLATE "sv%icu" AS "false";
 
 -- upper/lower
 
 CREATE TABLE collate_test10 (
     a int,
-    x text COLLATE "en_US",
-    y text COLLATE "tr_TR"
+    x text COLLATE "en%icu",
+    y text COLLATE "tr%icu"
 );
 
 INSERT INTO collate_test10 VALUES (1, 'hij', 'hij'), (2, 'HIJ', 'HIJ');
@@ -116,11 +117,11 @@ CREATE TABLE collate_test10 (
 SELECT * FROM collate_test1 WHERE b ILIKE 'abc%';
 SELECT * FROM collate_test1 WHERE b ILIKE '%bc%';
 
-SELECT 'Türkiye' COLLATE "en_US" ILIKE '%KI%' AS "true";
-SELECT 'Türkiye' COLLATE "tr_TR" ILIKE '%KI%' AS "false";
+SELECT 'Türkiye' COLLATE "en%icu" ILIKE '%KI%' AS "true";
+SELECT 'Türkiye' COLLATE "tr%icu" ILIKE '%KI%' AS "false";
 
-SELECT 'bıt' ILIKE 'BIT' COLLATE "en_US" AS "false";
-SELECT 'bıt' ILIKE 'BIT' COLLATE "tr_TR" AS "true";
+SELECT 'bıt' ILIKE 'BIT' COLLATE "en%icu" AS "false";
+SELECT 'bıt' ILIKE 'BIT' COLLATE "tr%icu" AS "true";
 
 -- The following actually exercises the selectivity estimation for ILIKE.
 SELECT relname FROM pg_class WHERE relname ILIKE 'abc%';
@@ -134,11 +135,30 @@ CREATE TABLE collate_test10 (
 SELECT * FROM collate_test1 WHERE b ~* '^abc';
 SELECT * FROM collate_test1 WHERE b ~* 'bc';
 
-SELECT 'Türkiye' COLLATE "en_US" ~* 'KI' AS "true";
-SELECT 'Türkiye' COLLATE "tr_TR" ~* 'KI' AS "false";
-
-SELECT 'bıt' ~* 'BIT' COLLATE "en_US" AS "false";
-SELECT 'bıt' ~* 'BIT' COLLATE "tr_TR" AS "true";
+CREATE TABLE collate_test6 (
+    a int,
+    b text COLLATE "en%icu"
+);
+INSERT INTO collate_test6 VALUES (1, 'abc'), (2, 'ABC'), (3, '123'), (4, 'ab1'),
+                                 (5, 'a1!'), (6, 'a c'), (7, '!.;'), (8, '   '),
+                                 (9, 'äbç'), (10, 'ÄBÇ');
+SELECT b,
+       b ~ '^[[:alpha:]]+$' AS is_alpha,
+       b ~ '^[[:upper:]]+$' AS is_upper,
+       b ~ '^[[:lower:]]+$' AS is_lower,
+       b ~ '^[[:digit:]]+$' AS is_digit,
+       b ~ '^[[:alnum:]]+$' AS is_alnum,
+       b ~ '^[[:graph:]]+$' AS is_graph,
+       b ~ '^[[:print:]]+$' AS is_print,
+       b ~ '^[[:punct:]]+$' AS is_punct,
+       b ~ '^[[:space:]]+$' AS is_space
+FROM collate_test6;
+
+SELECT 'Türkiye' COLLATE "en%icu" ~* 'KI' AS "true";
+SELECT 'Türkiye' COLLATE "tr%icu" ~* 'KI' AS "true";  -- true with ICU
+
+SELECT 'bıt' ~* 'BIT' COLLATE "en%icu" AS "false";
+SELECT 'bıt' ~* 'BIT' COLLATE "tr%icu" AS "false";  -- false with ICU
 
 -- The following actually exercises the selectivity estimation for ~*.
 SELECT relname FROM pg_class WHERE relname ~* '^abc';
@@ -148,7 +168,7 @@ CREATE TABLE collate_test10 (
 
 SET lc_time TO 'tr_TR';
 SELECT to_char(date '2010-04-01', 'DD TMMON YYYY');
-SELECT to_char(date '2010-04-01', 'DD TMMON YYYY' COLLATE "tr_TR");
+SELECT to_char(date '2010-04-01', 'DD TMMON YYYY' COLLATE "tr%icu");
 
 
 -- backwards parsing
@@ -218,9 +238,9 @@ CREATE TABLE test_u AS SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM
 
 -- collation mismatch between recursive and non-recursive term
 WITH RECURSIVE foo(x) AS
-   (SELECT x FROM (VALUES('a' COLLATE "en_US"),('b')) t(x)
+   (SELECT x FROM (VALUES('a' COLLATE "en%icu"),('b')) t(x)
    UNION ALL
-   SELECT (x || 'c') COLLATE "de_DE" FROM foo WHERE length(x) < 10)
+   SELECT (x || 'c') COLLATE "de%icu" FROM foo WHERE length(x) < 10)
 SELECT * FROM foo;
 
 
@@ -268,7 +288,7 @@ CREATE FUNCTION mylt2 (x text, y text) RETURNS boolean LANGUAGE plpgsql AS $$
 end
 $$;
 
-SELECT mylt2('a', 'B' collate "en_US") as t, mylt2('a', 'B' collate "C") as f;
+SELECT mylt2('a', 'B' collate "en%icu") as t, mylt2('a', 'B' collate "C") as f;
 
 CREATE OR REPLACE FUNCTION
   mylt2 (x text, y text) RETURNS boolean LANGUAGE plpgsql AS $$
@@ -320,23 +340,21 @@ CREATE SCHEMA test_schema;
 -- We need to do this this way to cope with varying names for encodings:
 do $$
 BEGIN
-  EXECUTE 'CREATE COLLATION test0 (locale = ' ||
+  EXECUTE 'CREATE COLLATION test0 (provider = icu, locale = ' ||
           quote_literal(current_setting('lc_collate')) || ');';
 END
 $$;
 CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
-CREATE COLLATION IF NOT EXISTS test0 FROM "C"; -- ok, skipped
-CREATE COLLATION IF NOT EXISTS test0 (locale = 'foo'); -- ok, skipped
 do $$
 BEGIN
-  EXECUTE 'CREATE COLLATION test1 (lc_collate = ' ||
+  EXECUTE 'CREATE COLLATION test1 (provider = icu, lc_collate = ' ||
           quote_literal(current_setting('lc_collate')) ||
           ', lc_ctype = ' ||
           quote_literal(current_setting('lc_ctype')) || ');';
 END
 $$;
-CREATE COLLATION test3 (lc_collate = 'en_US.utf8'); -- fail, need lc_ctype
-CREATE COLLATION testx (locale = 'nonsense'); -- fail
+CREATE COLLATION test3 (provider = icu, lc_collate = 'en_US.utf8'); -- fail, need lc_ctype
+CREATE COLLATION testx (provider = icu, locale = 'nonsense'); /* never fails with ICU */  DROP COLLATION testx;
 
 CREATE COLLATION test4 FROM nonsense;
 CREATE COLLATION test5 FROM test0;
@@ -368,6 +386,11 @@ CREATE COLLATION test5 FROM test0;
 DROP ROLE regress_test_role;
 
 
+-- ALTER
+
+ALTER COLLATION "en%icu" REFRESH VERSION;
+
+
 -- dependencies
 
 CREATE COLLATION test0 FROM "C";
@@ -391,10 +414,14 @@ CREATE INDEX collate_dep_test4i ON collate_dep_test4t (b COLLATE test0);
 -- test range types and collations
 
 create type textrange_c as range(subtype=text, collation="C");
-create type textrange_en_us as range(subtype=text, collation="en_US");
+create type textrange_en_us as range(subtype=text, collation="en%icu");
 
 select textrange_c('A','Z') @> 'b'::text;
 select textrange_en_us('A','Z') @> 'b'::text;
 
 drop type textrange_c;
 drop type textrange_en_us;
+
+
+-- cleanup
+DROP SCHEMA collate_tests CASCADE;
diff --git a/src/test/regress/sql/collate.linux.utf8.sql b/src/test/regress/sql/collate.linux.utf8.sql
index c349cbde2b..78c7ac2e44 100644
--- a/src/test/regress/sql/collate.linux.utf8.sql
+++ b/src/test/regress/sql/collate.linux.utf8.sql
@@ -6,6 +6,9 @@
 
 SET client_encoding TO UTF8;
 
+CREATE SCHEMA collate_tests;
+SET search_path = collate_tests;
+
 
 CREATE TABLE collate_test1 (
     a int,
@@ -134,6 +137,25 @@ CREATE TABLE collate_test10 (
 SELECT * FROM collate_test1 WHERE b ~* '^abc';
 SELECT * FROM collate_test1 WHERE b ~* 'bc';
 
+CREATE TABLE collate_test6 (
+    a int,
+    b text COLLATE "en_US"
+);
+INSERT INTO collate_test6 VALUES (1, 'abc'), (2, 'ABC'), (3, '123'), (4, 'ab1'),
+                                 (5, 'a1!'), (6, 'a c'), (7, '!.;'), (8, '   '),
+                                 (9, 'äbç'), (10, 'ÄBÇ');
+SELECT b,
+       b ~ '^[[:alpha:]]+$' AS is_alpha,
+       b ~ '^[[:upper:]]+$' AS is_upper,
+       b ~ '^[[:lower:]]+$' AS is_lower,
+       b ~ '^[[:digit:]]+$' AS is_digit,
+       b ~ '^[[:alnum:]]+$' AS is_alnum,
+       b ~ '^[[:graph:]]+$' AS is_graph,
+       b ~ '^[[:print:]]+$' AS is_print,
+       b ~ '^[[:punct:]]+$' AS is_punct,
+       b ~ '^[[:space:]]+$' AS is_space
+FROM collate_test6;
+
 SELECT 'Türkiye' COLLATE "en_US" ~* 'KI' AS "true";
 SELECT 'Türkiye' COLLATE "tr_TR" ~* 'KI' AS "false";
 
@@ -368,6 +390,11 @@ CREATE COLLATION test5 FROM test0;
 DROP ROLE regress_test_role;
 
 
+-- ALTER
+
+ALTER COLLATION "en_US" REFRESH VERSION;
+
+
 -- dependencies
 
 CREATE COLLATION test0 FROM "C";
@@ -398,3 +425,7 @@ CREATE INDEX collate_dep_test4i ON collate_dep_test4t (b COLLATE test0);
 
 drop type textrange_c;
 drop type textrange_en_us;
+
+
+-- cleanup
+DROP SCHEMA collate_tests CASCADE;
-- 
2.12.0

#73Andreas Karlsson
andreas@proxel.se
In reply to: Peter Eisentraut (#72)
Re: ICU integration

On 03/09/2017 10:13 PM, Peter Eisentraut wrote:

- Naming of collations: Are we happy with the "de%icu" naming? I might
have come up with that while reviewing the IPv6 zone index patch. ;-)
An alternative might be "de$icu" for more Oracle vibe and avoiding the
need for double quotes in some cases. (But we have mixed-case names
like "de_AT%icu", so further changes would be necessary to fully get rid
of the need for quoting.) A more radical alternative would be to
install ICU locales in a different schema and use the search_path, or
even have a separate search path setting for collations only. Which
leads to ...

I do not like the schema based solution since search_path already gives
us enough headaches. As for the naming I am fine with the current scheme.

It would be nice to have something we did not have to quote, but I do
not like using dollar signs since they are already used for other
things.

- Selecting default collation provider: Maybe we want a setting, say in
initdb, to determine which provider's collations get the "good" names?
Maybe not necessary for this release, but something to think about.

This does not seem necessary for the initial release.

- Currently (in this patch), we check a collation's version when it is
first used. But, say, after pg_upgrade, you might want to check all of
them right away. What might be a good interface for that? (Possibly,
we only have to check the ones actually in use, and we have dependency
information for that.)

How about adding a SQL-level function for checking this, which can be
called by pg_upgrade?
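
For example, if the provider's current collation version were exposed
through a SQL function (the function name below is hypothetical, and I
am assuming the ICU provider ends up as 'i' in collprovider), pg_upgrade
or a DBA could check everything with one query along these lines:

SELECT collname
FROM pg_collation
WHERE collprovider = 'i'
  AND collversion IS DISTINCT FROM pg_collation_actual_version(oid);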

= Review

Here is an initial review. I will try to find some time to do more
testing later this week.

This is a really useful feature given the poor quality of collation
support in libc. Just the fact that ICU versions its collations is
huge, as is the larger range of available collations and their high
configurability. Additionally, being able to use abbreviated keys again
would be huge.

ICU also supports writing your own collations and modifying existing
ones, something which might be possible to expose in the future. In
general ICU offers a lot for users who care about the details of text
collation.

== Functional

- Generally things seem to work fine and as expected.

- I get test failures in the default test suite due to not having the
tr_TR locale installed. I would assume that this is pretty common for
hackers.

***************
*** 428,443 ****

-- to_char
SET lc_time TO 'tr_TR';
SELECT to_char(date '2010-04-01', 'DD TMMON YYYY');
to_char
-------------
! 01 NIS 2010
(1 row)

SELECT to_char(date '2010-04-01', 'DD TMMON YYYY' COLLATE "tr%icu");
to_char
-------------
! 01 NİS 2010
(1 row)

   -- backwards parsing
--- 428,444 ----
   -- to_char
   SET lc_time TO 'tr_TR';
+ ERROR:  invalid value for parameter "lc_time": "tr_TR"
   SELECT to_char(date '2010-04-01', 'DD TMMON YYYY');
      to_char
   -------------
!  01 APR 2010
   (1 row)

SELECT to_char(date '2010-04-01', 'DD TMMON YYYY' COLLATE "tr%icu");
to_char
-------------
! 01 APR 2010
(1 row)

-- backwards parsing

======================================================================

- The code no longer compiles without HAVE_LOCALE_T.

- I do not like how it is not obvious which is the default version of
every locale. E.g. I believe that "sv%icu" is "sv_reformed%icu" and not
"sv_standard%icu" as one might expect. Is this inherent to ICU or
something we can work around?

- ICU talks about a new syntax for locale extensions (old:
de_DE@collation=phonebook, new: de_DE_u_co_phonebk) at this page
http://userguide.icu-project.org/collation/api. Is this something we
need to care about? It looks like we currently use the old format, and
while I personally prefer it, I am not sure we should rely on an old
syntax.
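
For illustration, with the CREATE COLLATION syntax from this patch the
two forms would presumably look something like this (an untested
sketch; the locale strings are just the standard ICU examples above):

CREATE COLLATION german_phonebook (provider = icu, locale = 'de_DE@collation=phonebook');
CREATE COLLATION german_phonebook2 (provider = icu, locale = 'de_DE_u_co_phonebk');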

- I get an error when creating a new locale.

#CREATE COLLATION sv2 (LOCALE = 'sv');
ERROR: could not create locale "sv": Success

# CREATE COLLATION sv2 (LOCALE = 'sv');
ERROR: could not create locale "sv": Resource temporarily unavailable
Time: 1.109 ms

== Code review

- For collprovider, is it really correct to say that 'd' is the default
value, as catalogs.sgml does?

- I do not know what the policy for formatting the documentation is, but
some of the paragraphs are in need of re-wrapping.

- Add a hint to "ERROR: conflicting or redundant options". The error
message is pretty unclear.

- I am not a fan of this patch comparing the encoding with a -1 literal.
How about adding -1 as a value to the enum? See the example below for a
place where the patch compares with -1.

             ereport(NOTICE,
                 (errcode(ERRCODE_DUPLICATE_OBJECT),
-                errmsg("collation \"%s\" for encoding \"%s\" already exists, skipping",
-                       collname, pg_encoding_to_char(collencoding))));
+                collencoding == -1
+                ? errmsg("collation \"%s\" already exists, skipping",
+                         collname)
+                : errmsg("collation \"%s\" for encoding \"%s\" already exists, skipping",
+                         collname, pg_encoding_to_char(collencoding))));
             return InvalidOid;

- The patch adds "FIXME" in the code below. Is that a leftover from
development or something which still needs doing?

     /*
      * Also forbid matching an any-encoding entry.  This test of course is not
      * backed up by the unique index, but it's not a problem since we don't
-    * support adding any-encoding entries after initdb.
+    * support adding any-encoding entries after initdb. FIXME
      */

- Should functions like normalize_locale_name() be renamed to indicate
they relate to libc locales? I am leaning towards doing so but have not
looked closely at the task.

Andreas


#74Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Andreas Karlsson (#73)
1 attachment(s)
Re: ICU integration

Updated patch attached.

On 3/14/17 22:15, Andreas Karlsson wrote:

I do not like the schema based solution since search_path already gives
us enough headaches. As for the naming I am fine with the current scheme.

Yeah, it seems we're going to settle on the suffix idea. See below for
an idea that builds on that.

- I get test failures in the default test suite due to not having the
tr_TR locale installed. I would assume that this is pretty common for
hackers.

I have removed that test. It seems it's not possible to test that
portably without major contortions.

- The code no longer compiles without HAVE_LOCALE_T.

Fixed that.

- I do not like how it is not obvious which is the default version of
every locale. E.g. I believe that "sv%icu" is "sv_reformed%icu" and not
"sv_standard%icu" as one might expect. Is this inherent to ICU or
something we can work around?

We get these keywords from ucol_getKeywordValuesForLocale(), which says
"Given a key and a locale, returns an array of string values in a
preferred order that would make a difference." So all those are
supposedly different from each other.

- ICU talks about a new syntax for locale extensions (old:
de_DE@collation=phonebook, new: de_DE_u_co_phonebk) at this page
http://userguide.icu-project.org/collation/api. Is this something we
need to care about? It looks like we currently use the old format, and
while I personally prefer it, I am not sure we should rely on an old
syntax.

Interesting. I hadn't seen this before, and the documentation is
sparse. But it seems that this refers to BCP 47 language tags, which
seem like a good idea.

So what I have done is change all the predefined ICU collations to be
named after the BCP 47 scheme, with a "private use" -x-icu suffix
(instead of %icu). This preserves the original idea but uses a standard
naming scheme.
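
As a sketch of how that looks to a user (the exact set of names depends
on which locales the installed ICU version provides), the predefined
collations would be used like any other collation:

CREATE TABLE icutest (t text COLLATE "sv-x-icu");
SELECT * FROM icutest ORDER BY t COLLATE "de-AT-x-icu";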

I'm not terribly worried that they are going to remove the "old" locale
naming, but just to be forward-looking, I have changed it so that the
collcollate entries are made using the "new" naming for ICU >= 54.
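
The effect is visible directly in the catalog; assuming the ICU
provider is stored as 'i' in collprovider, something like the following
shows whether the old @keyword=value form or the new language-tag form
ended up in collcollate:

SELECT collname, collcollate, collversion
FROM pg_collation
WHERE collprovider = 'i'
ORDER BY collname;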

- I get an error when creating a new locale.

#CREATE COLLATION sv2 (LOCALE = 'sv');
ERROR: could not create locale "sv": Success

# CREATE COLLATION sv2 (LOCALE = 'sv');
ERROR: could not create locale "sv": Resource temporarily unavailable
Time: 1.109 ms

Hmm, that's pretty straightforward code. What is your operating system?
What are the build options? Does it work without this patch?

- For collprovider, is it really correct to say that 'd' is the default
value, as catalogs.sgml does?

It doesn't say it's the default value; it says it uses the database
default. This is all a bit confusing. We have a collation object named
"default", which uses the locale set for the database. That's been that
way for a while. Now, when introducing the collation providers, that
"default" collation gets its own collprovider category 'd'. That is not
really used anywhere, but we have to put something there.

- I do not know what the policy for formatting the documentation is, but
some of the paragraphs are in need of re-wrapping.

Hmm, I don't see anything terribly bad.

- Add a hint to "ERROR: conflicting or redundant options". The error
message is pretty unclear.

I don't see that in my patch. Example?

- I am not a fan of this patch comparing the encoding with a -1 literal.
How about adding -1 as a value to the enum? See the example below for a
place where the patch compares with -1.

That's existing practice. Not a great practice, probably, but widespread.

- The patch adds "FIXME" in the code below. Is that a leftover from
development or something which still needs doing?

     /*
     * Also forbid matching an any-encoding entry.  This test of course is not
     * backed up by the unique index, but it's not a problem since we don't
-    * support adding any-encoding entries after initdb.
+    * support adding any-encoding entries after initdb. FIXME
     */

I had mentioned that upthread. It technically needs "doing" as you say,
but it's not clear how and it's not terribly important, arguably.

- Should functions like normalize_locale_name() be renamed to indicate
they relate to libc locales? I am leaning towards doing so but have not
looked closely at the task.

Renamed.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v6-0001-ICU-support.patch (application/x-patch)
From 28125318030fc420c1c0fc043bd784a8a2bce501 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter_e@gmx.net>
Date: Thu, 16 Feb 2017 00:06:16 -0500
Subject: [PATCH v6] ICU support

Add a column collprovider to pg_collation that determines which library
provides the collation data.  The existing choices are default and libc,
and this adds an icu choice, which uses the ICU4C library.

The pg_locale_t type is changed to a union that contains the
provider-specific locale handles.  Users of locale information are
changed to look into that struct for the appropriate handle to use.

Also add a collversion column that records the version of the collation
when it is created, and check at run time whether it is still the same.
This detects potentially incompatible library upgrades that can corrupt
indexes and other structures.  This is currently only supported by
ICU-provided collations.

initdb initializes the default collation set as before from the `locale
-a` output but also adds all available ICU locales with a "-x-icu"
appended.

Currently, ICU-provided collations can only be explicitly named
collations.  The global database locales are still always libc-provided.
---
 aclocal.m4                                         |   1 +
 config/pkg.m4                                      | 275 +++++++++++++
 configure                                          | 313 ++++++++++++++
 configure.in                                       |  35 ++
 doc/src/sgml/catalogs.sgml                         |  19 +
 doc/src/sgml/charset.sgml                          | 175 +++++++-
 doc/src/sgml/installation.sgml                     |  14 +
 doc/src/sgml/ref/alter_collation.sgml              |  42 ++
 doc/src/sgml/ref/create_collation.sgml             |  16 +-
 src/Makefile.global.in                             |   4 +
 src/backend/Makefile                               |   2 +-
 src/backend/catalog/pg_collation.c                 |  26 +-
 src/backend/commands/collationcmds.c               | 243 ++++++++++-
 src/backend/common.mk                              |   2 +
 src/backend/nodes/copyfuncs.c                      |  13 +
 src/backend/nodes/equalfuncs.c                     |  11 +
 src/backend/parser/gram.y                          |  18 +-
 src/backend/regex/regc_pg_locale.c                 | 110 +++--
 src/backend/tcop/utility.c                         |   8 +
 src/backend/utils/adt/formatting.c                 | 453 +++++++++++----------
 src/backend/utils/adt/like.c                       |  53 ++-
 src/backend/utils/adt/pg_locale.c                  | 259 ++++++++++--
 src/backend/utils/adt/selfuncs.c                   |   8 +-
 src/backend/utils/adt/varlena.c                    | 179 ++++++--
 src/backend/utils/mb/encnames.c                    |  76 ++++
 src/bin/initdb/initdb.c                            |   3 +-
 src/bin/pg_dump/pg_dump.c                          |  55 ++-
 src/bin/psql/describe.c                            |   7 +-
 src/include/catalog/pg_collation.h                 |  25 +-
 src/include/catalog/pg_collation_fn.h              |   1 +
 src/include/commands/collationcmds.h               |   1 +
 src/include/mb/pg_wchar.h                          |   6 +
 src/include/nodes/nodes.h                          |   1 +
 src/include/nodes/parsenodes.h                     |  11 +
 src/include/pg_config.h.in                         |   6 +
 src/include/utils/pg_locale.h                      |  32 +-
 src/test/regress/GNUmakefile                       |   3 +
 .../{collate.linux.utf8.out => collate.icu.out}    | 201 +++++----
 src/test/regress/expected/collate.linux.utf8.out   |  78 +++-
 .../{collate.linux.utf8.sql => collate.icu.sql}    | 101 +++--
 src/test/regress/sql/collate.linux.utf8.sql        |  31 ++
 41 files changed, 2406 insertions(+), 511 deletions(-)
 create mode 100644 config/pkg.m4
 copy src/test/regress/expected/{collate.linux.utf8.out => collate.icu.out} (80%)
 copy src/test/regress/sql/{collate.linux.utf8.sql => collate.icu.sql} (81%)

diff --git a/aclocal.m4 b/aclocal.m4
index 6f930b6fc1..5ca902b6a2 100644
--- a/aclocal.m4
+++ b/aclocal.m4
@@ -7,6 +7,7 @@ m4_include([config/docbook.m4])
 m4_include([config/general.m4])
 m4_include([config/libtool.m4])
 m4_include([config/perl.m4])
+m4_include([config/pkg.m4])
 m4_include([config/programs.m4])
 m4_include([config/python.m4])
 m4_include([config/tcl.m4])
diff --git a/config/pkg.m4 b/config/pkg.m4
new file mode 100644
index 0000000000..82bea96ee7
--- /dev/null
+++ b/config/pkg.m4
@@ -0,0 +1,275 @@
+dnl pkg.m4 - Macros to locate and utilise pkg-config.   -*- Autoconf -*-
+dnl serial 11 (pkg-config-0.29.1)
+dnl
+dnl Copyright © 2004 Scott James Remnant <scott@netsplit.com>.
+dnl Copyright © 2012-2015 Dan Nicholson <dbn.lists@gmail.com>
+dnl
+dnl This program is free software; you can redistribute it and/or modify
+dnl it under the terms of the GNU General Public License as published by
+dnl the Free Software Foundation; either version 2 of the License, or
+dnl (at your option) any later version.
+dnl
+dnl This program is distributed in the hope that it will be useful, but
+dnl WITHOUT ANY WARRANTY; without even the implied warranty of
+dnl MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+dnl General Public License for more details.
+dnl
+dnl You should have received a copy of the GNU General Public License
+dnl along with this program; if not, write to the Free Software
+dnl Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA
+dnl 02111-1307, USA.
+dnl
+dnl As a special exception to the GNU General Public License, if you
+dnl distribute this file as part of a program that contains a
+dnl configuration script generated by Autoconf, you may include it under
+dnl the same distribution terms that you use for the rest of that
+dnl program.
+
+dnl PKG_PREREQ(MIN-VERSION)
+dnl -----------------------
+dnl Since: 0.29
+dnl
+dnl Verify that the version of the pkg-config macros are at least
+dnl MIN-VERSION. Unlike PKG_PROG_PKG_CONFIG, which checks the user's
+dnl installed version of pkg-config, this checks the developer's version
+dnl of pkg.m4 when generating configure.
+dnl
+dnl To ensure that this macro is defined, also add:
+dnl m4_ifndef([PKG_PREREQ],
+dnl     [m4_fatal([must install pkg-config 0.29 or later before running autoconf/autogen])])
+dnl
+dnl See the "Since" comment for each macro you use to see what version
+dnl of the macros you require.
+m4_defun([PKG_PREREQ],
+[m4_define([PKG_MACROS_VERSION], [0.29.1])
+m4_if(m4_version_compare(PKG_MACROS_VERSION, [$1]), -1,
+    [m4_fatal([pkg.m4 version $1 or higher is required but ]PKG_MACROS_VERSION[ found])])
+])dnl PKG_PREREQ
+
+dnl PKG_PROG_PKG_CONFIG([MIN-VERSION])
+dnl ----------------------------------
+dnl Since: 0.16
+dnl
+dnl Search for the pkg-config tool and set the PKG_CONFIG variable to
+dnl first found in the path. Checks that the version of pkg-config found
+dnl is at least MIN-VERSION. If MIN-VERSION is not specified, 0.9.0 is
+dnl used since that's the first version where most current features of
+dnl pkg-config existed.
+AC_DEFUN([PKG_PROG_PKG_CONFIG],
+[m4_pattern_forbid([^_?PKG_[A-Z_]+$])
+m4_pattern_allow([^PKG_CONFIG(_(PATH|LIBDIR|SYSROOT_DIR|ALLOW_SYSTEM_(CFLAGS|LIBS)))?$])
+m4_pattern_allow([^PKG_CONFIG_(DISABLE_UNINSTALLED|TOP_BUILD_DIR|DEBUG_SPEW)$])
+AC_ARG_VAR([PKG_CONFIG], [path to pkg-config utility])
+AC_ARG_VAR([PKG_CONFIG_PATH], [directories to add to pkg-config's search path])
+AC_ARG_VAR([PKG_CONFIG_LIBDIR], [path overriding pkg-config's built-in search path])
+
+if test "x$ac_cv_env_PKG_CONFIG_set" != "xset"; then
+	AC_PATH_TOOL([PKG_CONFIG], [pkg-config])
+fi
+if test -n "$PKG_CONFIG"; then
+	_pkg_min_version=m4_default([$1], [0.9.0])
+	AC_MSG_CHECKING([pkg-config is at least version $_pkg_min_version])
+	if $PKG_CONFIG --atleast-pkgconfig-version $_pkg_min_version; then
+		AC_MSG_RESULT([yes])
+	else
+		AC_MSG_RESULT([no])
+		PKG_CONFIG=""
+	fi
+fi[]dnl
+])dnl PKG_PROG_PKG_CONFIG
+
+dnl PKG_CHECK_EXISTS(MODULES, [ACTION-IF-FOUND], [ACTION-IF-NOT-FOUND])
+dnl -------------------------------------------------------------------
+dnl Since: 0.18
+dnl
+dnl Check to see whether a particular set of modules exists. Similar to
+dnl PKG_CHECK_MODULES(), but does not set variables or print errors.
+dnl
+dnl Please remember that m4 expands AC_REQUIRE([PKG_PROG_PKG_CONFIG])
+dnl only at the first occurence in configure.ac, so if the first place
+dnl it's called might be skipped (such as if it is within an "if", you
+dnl have to call PKG_CHECK_EXISTS manually
+AC_DEFUN([PKG_CHECK_EXISTS],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])dnl
+if test -n "$PKG_CONFIG" && \
+    AC_RUN_LOG([$PKG_CONFIG --exists --print-errors "$1"]); then
+  m4_default([$2], [:])
+m4_ifvaln([$3], [else
+  $3])dnl
+fi])
+
+dnl _PKG_CONFIG([VARIABLE], [COMMAND], [MODULES])
+dnl ---------------------------------------------
+dnl Internal wrapper calling pkg-config via PKG_CONFIG and setting
+dnl pkg_failed based on the result.
+m4_define([_PKG_CONFIG],
+[if test -n "$$1"; then
+    pkg_cv_[]$1="$$1"
+ elif test -n "$PKG_CONFIG"; then
+    PKG_CHECK_EXISTS([$3],
+                     [pkg_cv_[]$1=`$PKG_CONFIG --[]$2 "$3" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes ],
+		     [pkg_failed=yes])
+ else
+    pkg_failed=untried
+fi[]dnl
+])dnl _PKG_CONFIG
+
+dnl _PKG_SHORT_ERRORS_SUPPORTED
+dnl ---------------------------
+dnl Internal check to see if pkg-config supports short errors.
+AC_DEFUN([_PKG_SHORT_ERRORS_SUPPORTED],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])
+if $PKG_CONFIG --atleast-pkgconfig-version 0.20; then
+        _pkg_short_errors_supported=yes
+else
+        _pkg_short_errors_supported=no
+fi[]dnl
+])dnl _PKG_SHORT_ERRORS_SUPPORTED
+
+
+dnl PKG_CHECK_MODULES(VARIABLE-PREFIX, MODULES, [ACTION-IF-FOUND],
+dnl   [ACTION-IF-NOT-FOUND])
+dnl --------------------------------------------------------------
+dnl Since: 0.4.0
+dnl
+dnl Note that if there is a possibility the first call to
+dnl PKG_CHECK_MODULES might not happen, you should be sure to include an
+dnl explicit call to PKG_PROG_PKG_CONFIG in your configure.ac
+AC_DEFUN([PKG_CHECK_MODULES],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])dnl
+AC_ARG_VAR([$1][_CFLAGS], [C compiler flags for $1, overriding pkg-config])dnl
+AC_ARG_VAR([$1][_LIBS], [linker flags for $1, overriding pkg-config])dnl
+
+pkg_failed=no
+AC_MSG_CHECKING([for $1])
+
+_PKG_CONFIG([$1][_CFLAGS], [cflags], [$2])
+_PKG_CONFIG([$1][_LIBS], [libs], [$2])
+
+m4_define([_PKG_TEXT], [Alternatively, you may set the environment variables $1[]_CFLAGS
+and $1[]_LIBS to avoid the need to call pkg-config.
+See the pkg-config man page for more details.])
+
+if test $pkg_failed = yes; then
+   	AC_MSG_RESULT([no])
+        _PKG_SHORT_ERRORS_SUPPORTED
+        if test $_pkg_short_errors_supported = yes; then
+	        $1[]_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "$2" 2>&1`
+        else 
+	        $1[]_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "$2" 2>&1`
+        fi
+	# Put the nasty error message in config.log where it belongs
+	echo "$$1[]_PKG_ERRORS" >&AS_MESSAGE_LOG_FD
+
+	m4_default([$4], [AC_MSG_ERROR(
+[Package requirements ($2) were not met:
+
+$$1_PKG_ERRORS
+
+Consider adjusting the PKG_CONFIG_PATH environment variable if you
+installed software in a non-standard prefix.
+
+_PKG_TEXT])[]dnl
+        ])
+elif test $pkg_failed = untried; then
+     	AC_MSG_RESULT([no])
+	m4_default([$4], [AC_MSG_FAILURE(
+[The pkg-config script could not be found or is too old.  Make sure it
+is in your PATH or set the PKG_CONFIG environment variable to the full
+path to pkg-config.
+
+_PKG_TEXT
+
+To get pkg-config, see <http://pkg-config.freedesktop.org/>.])[]dnl
+        ])
+else
+	$1[]_CFLAGS=$pkg_cv_[]$1[]_CFLAGS
+	$1[]_LIBS=$pkg_cv_[]$1[]_LIBS
+        AC_MSG_RESULT([yes])
+	$3
+fi[]dnl
+])dnl PKG_CHECK_MODULES
+
+
+dnl PKG_CHECK_MODULES_STATIC(VARIABLE-PREFIX, MODULES, [ACTION-IF-FOUND],
+dnl   [ACTION-IF-NOT-FOUND])
+dnl ---------------------------------------------------------------------
+dnl Since: 0.29
+dnl
+dnl Checks for existence of MODULES and gathers its build flags with
+dnl static libraries enabled. Sets VARIABLE-PREFIX_CFLAGS from --cflags
+dnl and VARIABLE-PREFIX_LIBS from --libs.
+dnl
+dnl Note that if there is a possibility the first call to
+dnl PKG_CHECK_MODULES_STATIC might not happen, you should be sure to
+dnl include an explicit call to PKG_PROG_PKG_CONFIG in your
+dnl configure.ac.
+AC_DEFUN([PKG_CHECK_MODULES_STATIC],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])dnl
+_save_PKG_CONFIG=$PKG_CONFIG
+PKG_CONFIG="$PKG_CONFIG --static"
+PKG_CHECK_MODULES($@)
+PKG_CONFIG=$_save_PKG_CONFIG[]dnl
+])dnl PKG_CHECK_MODULES_STATIC
+
+
+dnl PKG_INSTALLDIR([DIRECTORY])
+dnl -------------------------
+dnl Since: 0.27
+dnl
+dnl Substitutes the variable pkgconfigdir as the location where a module
+dnl should install pkg-config .pc files. By default the directory is
+dnl $libdir/pkgconfig, but the default can be changed by passing
+dnl DIRECTORY. The user can override through the --with-pkgconfigdir
+dnl parameter.
+AC_DEFUN([PKG_INSTALLDIR],
+[m4_pushdef([pkg_default], [m4_default([$1], ['${libdir}/pkgconfig'])])
+m4_pushdef([pkg_description],
+    [pkg-config installation directory @<:@]pkg_default[@:>@])
+AC_ARG_WITH([pkgconfigdir],
+    [AS_HELP_STRING([--with-pkgconfigdir], pkg_description)],,
+    [with_pkgconfigdir=]pkg_default)
+AC_SUBST([pkgconfigdir], [$with_pkgconfigdir])
+m4_popdef([pkg_default])
+m4_popdef([pkg_description])
+])dnl PKG_INSTALLDIR
+
+
+dnl PKG_NOARCH_INSTALLDIR([DIRECTORY])
+dnl --------------------------------
+dnl Since: 0.27
+dnl
+dnl Substitutes the variable noarch_pkgconfigdir as the location where a
+dnl module should install arch-independent pkg-config .pc files. By
+dnl default the directory is $datadir/pkgconfig, but the default can be
+dnl changed by passing DIRECTORY. The user can override through the
+dnl --with-noarch-pkgconfigdir parameter.
+AC_DEFUN([PKG_NOARCH_INSTALLDIR],
+[m4_pushdef([pkg_default], [m4_default([$1], ['${datadir}/pkgconfig'])])
+m4_pushdef([pkg_description],
+    [pkg-config arch-independent installation directory @<:@]pkg_default[@:>@])
+AC_ARG_WITH([noarch-pkgconfigdir],
+    [AS_HELP_STRING([--with-noarch-pkgconfigdir], pkg_description)],,
+    [with_noarch_pkgconfigdir=]pkg_default)
+AC_SUBST([noarch_pkgconfigdir], [$with_noarch_pkgconfigdir])
+m4_popdef([pkg_default])
+m4_popdef([pkg_description])
+])dnl PKG_NOARCH_INSTALLDIR
+
+
+dnl PKG_CHECK_VAR(VARIABLE, MODULE, CONFIG-VARIABLE,
+dnl [ACTION-IF-FOUND], [ACTION-IF-NOT-FOUND])
+dnl -------------------------------------------
+dnl Since: 0.28
+dnl
+dnl Retrieves the value of the pkg-config variable for the given module.
+AC_DEFUN([PKG_CHECK_VAR],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])dnl
+AC_ARG_VAR([$1], [value of $3 for $2, overriding pkg-config])dnl
+
+_PKG_CONFIG([$1], [variable="][$3]["], [$2])
+AS_VAR_COPY([$1], [pkg_cv_][$1])
+
+AS_VAR_IF([$1], [""], [$5], [$4])dnl
+])dnl PKG_CHECK_VAR
diff --git a/configure b/configure
index b5cdebb510..481d5d15e8 100755
--- a/configure
+++ b/configure
@@ -715,6 +715,12 @@ krb_srvtab
 with_python
 with_perl
 with_tcl
+ICU_LIBS
+ICU_CFLAGS
+PKG_CONFIG_LIBDIR
+PKG_CONFIG_PATH
+PKG_CONFIG
+with_icu
 enable_thread_safety
 INCLUDES
 autodepend
@@ -821,6 +827,7 @@ with_CC
 enable_depend
 enable_cassert
 enable_thread_safety
+with_icu
 with_tcl
 with_tclconfig
 with_perl
@@ -856,6 +863,11 @@ LDFLAGS
 LIBS
 CPPFLAGS
 CPP
+PKG_CONFIG
+PKG_CONFIG_PATH
+PKG_CONFIG_LIBDIR
+ICU_CFLAGS
+ICU_LIBS
 LDFLAGS_EX
 LDFLAGS_SL
 DOCBOOKSTYLE'
@@ -1511,6 +1523,7 @@ Optional Packages:
   --with-wal-segsize=SEGSIZE
                           set WAL segment size in MB [16]
   --with-CC=CMD           set compiler (deprecated)
+  --with-icu              build with ICU support
   --with-tcl              build Tcl modules (PL/Tcl)
   --with-tclconfig=DIR    tclConfig.sh is in DIR
   --with-perl             build Perl modules (PL/Perl)
@@ -1546,6 +1559,13 @@ Some influential environment variables:
   CPPFLAGS    (Objective) C/C++ preprocessor flags, e.g. -I<include dir> if
               you have headers in a nonstandard directory <include dir>
   CPP         C preprocessor
+  PKG_CONFIG  path to pkg-config utility
+  PKG_CONFIG_PATH
+              directories to add to pkg-config's search path
+  PKG_CONFIG_LIBDIR
+              path overriding pkg-config's built-in search path
+  ICU_CFLAGS  C compiler flags for ICU, overriding pkg-config
+  ICU_LIBS    linker flags for ICU, overriding pkg-config
   LDFLAGS_EX  extra linker flags for linking executables only
   LDFLAGS_SL  extra linker flags for linking shared libraries only
   DOCBOOKSTYLE
@@ -5362,6 +5382,255 @@ $as_echo "$enable_thread_safety" >&6; }
 
 
 #
+# ICU
+#
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with ICU support" >&5
+$as_echo_n "checking whether to build with ICU support... " >&6; }
+
+
+
+# Check whether --with-icu was given.
+if test "${with_icu+set}" = set; then :
+  withval=$with_icu;
+  case $withval in
+    yes)
+
+$as_echo "#define USE_ICU 1" >>confdefs.h
+
+      ;;
+    no)
+      :
+      ;;
+    *)
+      as_fn_error $? "no argument expected for --with-icu option" "$LINENO" 5
+      ;;
+  esac
+
+else
+  with_icu=no
+
+fi
+
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $with_icu" >&5
+$as_echo "$with_icu" >&6; }
+
+
+if test "$with_icu" = yes; then
+
+
+
+
+
+
+
+if test "x$ac_cv_env_PKG_CONFIG_set" != "xset"; then
+	if test -n "$ac_tool_prefix"; then
+  # Extract the first word of "${ac_tool_prefix}pkg-config", so it can be a program name with args.
+set dummy ${ac_tool_prefix}pkg-config; ac_word=$2
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
+$as_echo_n "checking for $ac_word... " >&6; }
+if ${ac_cv_path_PKG_CONFIG+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  case $PKG_CONFIG in
+  [\\/]* | ?:[\\/]*)
+  ac_cv_path_PKG_CONFIG="$PKG_CONFIG" # Let the user override the test with a path.
+  ;;
+  *)
+  as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
+for as_dir in $PATH
+do
+  IFS=$as_save_IFS
+  test -z "$as_dir" && as_dir=.
+    for ac_exec_ext in '' $ac_executable_extensions; do
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
+    ac_cv_path_PKG_CONFIG="$as_dir/$ac_word$ac_exec_ext"
+    $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
+    break 2
+  fi
+done
+  done
+IFS=$as_save_IFS
+
+  ;;
+esac
+fi
+PKG_CONFIG=$ac_cv_path_PKG_CONFIG
+if test -n "$PKG_CONFIG"; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $PKG_CONFIG" >&5
+$as_echo "$PKG_CONFIG" >&6; }
+else
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+fi
+
+
+fi
+if test -z "$ac_cv_path_PKG_CONFIG"; then
+  ac_pt_PKG_CONFIG=$PKG_CONFIG
+  # Extract the first word of "pkg-config", so it can be a program name with args.
+set dummy pkg-config; ac_word=$2
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
+$as_echo_n "checking for $ac_word... " >&6; }
+if ${ac_cv_path_ac_pt_PKG_CONFIG+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  case $ac_pt_PKG_CONFIG in
+  [\\/]* | ?:[\\/]*)
+  ac_cv_path_ac_pt_PKG_CONFIG="$ac_pt_PKG_CONFIG" # Let the user override the test with a path.
+  ;;
+  *)
+  as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
+for as_dir in $PATH
+do
+  IFS=$as_save_IFS
+  test -z "$as_dir" && as_dir=.
+    for ac_exec_ext in '' $ac_executable_extensions; do
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
+    ac_cv_path_ac_pt_PKG_CONFIG="$as_dir/$ac_word$ac_exec_ext"
+    $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
+    break 2
+  fi
+done
+  done
+IFS=$as_save_IFS
+
+  ;;
+esac
+fi
+ac_pt_PKG_CONFIG=$ac_cv_path_ac_pt_PKG_CONFIG
+if test -n "$ac_pt_PKG_CONFIG"; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_pt_PKG_CONFIG" >&5
+$as_echo "$ac_pt_PKG_CONFIG" >&6; }
+else
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+fi
+
+  if test "x$ac_pt_PKG_CONFIG" = x; then
+    PKG_CONFIG=""
+  else
+    case $cross_compiling:$ac_tool_warned in
+yes:)
+{ $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: using cross tools not prefixed with host triplet" >&5
+$as_echo "$as_me: WARNING: using cross tools not prefixed with host triplet" >&2;}
+ac_tool_warned=yes ;;
+esac
+    PKG_CONFIG=$ac_pt_PKG_CONFIG
+  fi
+else
+  PKG_CONFIG="$ac_cv_path_PKG_CONFIG"
+fi
+
+fi
+if test -n "$PKG_CONFIG"; then
+	_pkg_min_version=0.9.0
+	{ $as_echo "$as_me:${as_lineno-$LINENO}: checking pkg-config is at least version $_pkg_min_version" >&5
+$as_echo_n "checking pkg-config is at least version $_pkg_min_version... " >&6; }
+	if $PKG_CONFIG --atleast-pkgconfig-version $_pkg_min_version; then
+		{ $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5
+$as_echo "yes" >&6; }
+	else
+		{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+		PKG_CONFIG=""
+	fi
+fi
+
+pkg_failed=no
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for ICU" >&5
+$as_echo_n "checking for ICU... " >&6; }
+
+if test -n "$ICU_CFLAGS"; then
+    pkg_cv_ICU_CFLAGS="$ICU_CFLAGS"
+ elif test -n "$PKG_CONFIG"; then
+    if test -n "$PKG_CONFIG" && \
+    { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"icu-uc icu-i18n\""; } >&5
+  ($PKG_CONFIG --exists --print-errors "icu-uc icu-i18n") 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; then
+  pkg_cv_ICU_CFLAGS=`$PKG_CONFIG --cflags "icu-uc icu-i18n" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes
+else
+  pkg_failed=yes
+fi
+ else
+    pkg_failed=untried
+fi
+if test -n "$ICU_LIBS"; then
+    pkg_cv_ICU_LIBS="$ICU_LIBS"
+ elif test -n "$PKG_CONFIG"; then
+    if test -n "$PKG_CONFIG" && \
+    { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"icu-uc icu-i18n\""; } >&5
+  ($PKG_CONFIG --exists --print-errors "icu-uc icu-i18n") 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; then
+  pkg_cv_ICU_LIBS=`$PKG_CONFIG --libs "icu-uc icu-i18n" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes
+else
+  pkg_failed=yes
+fi
+ else
+    pkg_failed=untried
+fi
+
+
+
+if test $pkg_failed = yes; then
+   	{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+
+if $PKG_CONFIG --atleast-pkgconfig-version 0.20; then
+        _pkg_short_errors_supported=yes
+else
+        _pkg_short_errors_supported=no
+fi
+        if test $_pkg_short_errors_supported = yes; then
+	        ICU_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "icu-uc icu-i18n" 2>&1`
+        else
+	        ICU_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "icu-uc icu-i18n" 2>&1`
+        fi
+	# Put the nasty error message in config.log where it belongs
+	echo "$ICU_PKG_ERRORS" >&5
+
+	as_fn_error $? "Package requirements (icu-uc icu-i18n) were not met:
+
+$ICU_PKG_ERRORS
+
+Consider adjusting the PKG_CONFIG_PATH environment variable if you
+installed software in a non-standard prefix.
+
+Alternatively, you may set the environment variables ICU_CFLAGS
+and ICU_LIBS to avoid the need to call pkg-config.
+See the pkg-config man page for more details." "$LINENO" 5
+elif test $pkg_failed = untried; then
+     	{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+	{ { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5
+$as_echo "$as_me: error: in \`$ac_pwd':" >&2;}
+as_fn_error $? "The pkg-config script could not be found or is too old.  Make sure it
+is in your PATH or set the PKG_CONFIG environment variable to the full
+path to pkg-config.
+
+Alternatively, you may set the environment variables ICU_CFLAGS
+and ICU_LIBS to avoid the need to call pkg-config.
+See the pkg-config man page for more details.
+
+To get pkg-config, see <http://pkg-config.freedesktop.org/>.
+See \`config.log' for more details" "$LINENO" 5; }
+else
+	ICU_CFLAGS=$pkg_cv_ICU_CFLAGS
+	ICU_LIBS=$pkg_cv_ICU_LIBS
+        { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5
+$as_echo "yes" >&6; }
+
+fi
+fi
+
+#
 # Optionally build Tcl modules (PL/Tcl)
 #
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with Tcl" >&5
@@ -13433,6 +13702,50 @@ fi
 done
 
 
+if test "$with_icu" = yes; then
+  # ICU functions are macros, so we need to do this the long way.
+
+  # ucol_strcollUTF8() appeared in ICU 50.
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for ucol_strcollUTF8" >&5
+$as_echo_n "checking for ucol_strcollUTF8... " >&6; }
+if ${pgac_cv_func_ucol_strcollUTF8+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  ac_save_CPPFLAGS=$CPPFLAGS
+CPPFLAGS="$ICU_CFLAGS $CPPFLAGS"
+ac_save_LIBS=$LIBS
+LIBS="$ICU_LIBS $LIBS"
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+#include <unicode/ucol.h>
+
+int
+main ()
+{
+ucol_strcollUTF8(NULL, NULL, 0, NULL, 0, NULL);
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+  pgac_cv_func_ucol_strcollUTF8=yes
+else
+  pgac_cv_func_ucol_strcollUTF8=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+    conftest$ac_exeext conftest.$ac_ext
+CPPFLAGS=$ac_save_CPPFLAGS
+LIBS=$ac_save_LIBS
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_func_ucol_strcollUTF8" >&5
+$as_echo "$pgac_cv_func_ucol_strcollUTF8" >&6; }
+  if test "$pgac_cv_func_ucol_strcollUTF8" = yes ; then
+
+$as_echo "#define HAVE_UCOL_STRCOLLUTF8 1" >>confdefs.h
+
+  fi
+fi
+
 # Lastly, restore full LIBS list and check for readline/libedit symbols
 LIBS="$LIBS_including_readline"
 
diff --git a/configure.in b/configure.in
index 1d99cda1d8..235b8745d6 100644
--- a/configure.in
+++ b/configure.in
@@ -614,6 +614,19 @@ AC_MSG_RESULT([$enable_thread_safety])
 AC_SUBST(enable_thread_safety)
 
 #
+# ICU
+#
+AC_MSG_CHECKING([whether to build with ICU support])
+PGAC_ARG_BOOL(with, icu, no, [build with ICU support],
+              [AC_DEFINE([USE_ICU], 1, [Define to build with ICU support. (--with-icu)])])
+AC_MSG_RESULT([$with_icu])
+AC_SUBST(with_icu)
+
+if test "$with_icu" = yes; then
+  PKG_CHECK_MODULES(ICU, icu-uc icu-i18n)
+fi
+
+#
 # Optionally build Tcl modules (PL/Tcl)
 #
 AC_MSG_CHECKING([whether to build with Tcl])
@@ -1634,6 +1647,28 @@ fi
 AC_CHECK_FUNCS([strtoll strtoq], [break])
 AC_CHECK_FUNCS([strtoull strtouq], [break])
 
+if test "$with_icu" = yes; then
+  # ICU functions are macros, so we need to do this the long way.
+
+  # ucol_strcollUTF8() appeared in ICU 50.
+  AC_CACHE_CHECK([for ucol_strcollUTF8], [pgac_cv_func_ucol_strcollUTF8],
+[ac_save_CPPFLAGS=$CPPFLAGS
+CPPFLAGS="$ICU_CFLAGS $CPPFLAGS"
+ac_save_LIBS=$LIBS
+LIBS="$ICU_LIBS $LIBS"
+AC_LINK_IFELSE([AC_LANG_PROGRAM(
+[#include <unicode/ucol.h>
+],
+[ucol_strcollUTF8(NULL, NULL, 0, NULL, 0, NULL);])],
+[pgac_cv_func_ucol_strcollUTF8=yes],
+[pgac_cv_func_ucol_strcollUTF8=no])
+CPPFLAGS=$ac_save_CPPFLAGS
+LIBS=$ac_save_LIBS])
+  if test "$pgac_cv_func_ucol_strcollUTF8" = yes ; then
+    AC_DEFINE([HAVE_UCOL_STRCOLLUTF8], 1, [Define to 1 if you have the `ucol_strcollUTF8' function.])
+  fi
+fi
+
 # Lastly, restore full LIBS list and check for readline/libedit symbols
 LIBS="$LIBS_including_readline"
 
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 2c2da2ad8a..2f604eb25c 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -2016,6 +2016,14 @@ <title><structname>pg_collation</> Columns</title>
      </row>
 
      <row>
+      <entry><structfield>collprovider</structfield></entry>
+      <entry><type>char</type></entry>
+      <entry></entry>
+      <entry>Provider of the collation: <literal>d</literal> = database
+       default, <literal>c</literal> = libc, <literal>i</literal> = icu</entry>
+     </row>
+
+     <row>
       <entry><structfield>collencoding</structfield></entry>
       <entry><type>int4</type></entry>
       <entry></entry>
@@ -2036,6 +2044,17 @@ <title><structname>pg_collation</> Columns</title>
       <entry></entry>
       <entry><symbol>LC_CTYPE</> for this collation object</entry>
      </row>
+
+     <row>
+      <entry><structfield>collversion</structfield></entry>
+      <entry><type>text</type></entry>
+      <entry></entry>
+      <entry>
+       Provider-specific version of the collation.  This is recorded when the
+       collation is created and then checked when it is used, to detect
+       changes in the collation definition that could lead to data corruption.
+      </entry>
+     </row>
     </tbody>
    </tgroup>
   </table>
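
For illustration only (not part of the patch): assuming the two columns added above, the provider and recorded version of a collation could be inspected with a query along these lines; the collation name is just an example:

    SELECT collname, collprovider, collversion
    FROM pg_collation
    WHERE collname = 'de-x-icu';
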
diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml
index 2aba0fc528..ab83947db0 100644
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@@ -500,21 +500,45 @@ <title>Concepts</title>
    <title>Managing Collations</title>
 
    <para>
-    A collation is an SQL schema object that maps an SQL name to
-    operating system locales.  In particular, it maps to a combination
-    of <symbol>LC_COLLATE</symbol> and <symbol>LC_CTYPE</symbol>.  (As
+    A collation is an SQL schema object that maps an SQL name to locales
+    provided by libraries installed in the operating system.  A collation
+    definition has a <firstterm>provider</firstterm> that specifies which
+    library supplies the locale data.  One standard provider name
+    is <literal>libc</literal>, which uses the locales provided by the
+    operating system C library.  These are the locales that most tools
+    provided by the operating system use.  Another provider
+    is <literal>icu</literal>, which uses the external
+    ICU<indexterm><primary>ICU</></> library.  Support for ICU has to be
+    configured when PostgreSQL is built.
+   </para>
+
+   <para>
+    A collation object provided by <literal>libc</literal> maps to a combination
+    of <symbol>LC_COLLATE</symbol> and <symbol>LC_CTYPE</symbol> settings.  (As
     the name would suggest, the main purpose of a collation is to set
     <symbol>LC_COLLATE</symbol>, which controls the sort order.  But
     it is rarely necessary in practice to have an
     <symbol>LC_CTYPE</symbol> setting that is different from
     <symbol>LC_COLLATE</symbol>, so it is more convenient to collect
     these under one concept than to create another infrastructure for
-    setting <symbol>LC_CTYPE</symbol> per expression.)  Also, a collation
+    setting <symbol>LC_CTYPE</symbol> per expression.)  Also, a <literal>libc</literal> collation
     is tied to a character set encoding (see <xref linkend="multibyte">).
     The same collation name may exist for different encodings.
    </para>
 
    <para>
+    A collation provided by <literal>icu</literal> maps to a named collator
+    provided by the ICU library.  ICU does not support
+    separate <quote>collate</quote> and <quote>ctype</quote> settings, so they
+    are always the same.  Also, ICU collations are independent of the
+    encoding, so there is always only one ICU collation for a given name in a
+    database.
+   </para>
+
+   <sect3>
+    <title>Standard Collations</title>
+
+   <para>
     On all platforms, the collations named <literal>default</>,
     <literal>C</>, and <literal>POSIX</> are available.  Additional
     collations may be available depending on operating system support.
@@ -528,12 +552,36 @@ <title>Managing Collations</title>
    </para>
 
    <para>
+    Additionally, the SQL standard collation name <literal>ucs_basic</literal>
+    is available for encoding <literal>UTF8</literal>.  It is equivalent
+    to <literal>C</literal> and sorts by Unicode code point.
+   </para>
+  </sect3>
+
+  <sect3>
+   <title>Predefined Collations</title>
+
+   <para>
     If the operating system provides support for using multiple locales
     within a single program (<function>newlocale</> and related functions),
+    or support for ICU is configured,
     then when a database cluster is initialized, <command>initdb</command>
     populates the system catalog <literal>pg_collation</literal> with
     collations based on all the locales it finds on the operating
-    system at the time.  For example, the operating system might
+    system at the time.
+   </para>
+
+   <para>
+    To inspect the currently available locales, use the query <literal>SELECT
+    * FROM pg_collation</literal>, or the command <command>\dOS+</command>
+    in <application>psql</application>.
+   </para>
+
+  <sect4>
+   <title>libc collations</title>
+
+   <para>
+    For example, the operating system might
     provide a locale named <literal>de_DE.utf8</literal>.
     <command>initdb</command> would then create a collation named
     <literal>de_DE.utf8</literal> for encoding <literal>UTF8</literal>
@@ -548,13 +596,14 @@ <title>Managing Collations</title>
    </para>
 
    <para>
-    In case a collation is needed that has different values for
-    <symbol>LC_COLLATE</symbol> and <symbol>LC_CTYPE</symbol>, a new
-    collation may be created using
-    the <xref linkend="sql-createcollation"> command.  That command
-    can also be used to create a new collation from an existing
-    collation, which can be useful to be able to use
-    operating-system-independent collation names in applications.
+    The default set of collations provided by <literal>libc</literal> maps
+    directly to the locales installed in the operating system, which can be
+    listed using the command <literal>locale -a</literal>.
+    In case a <literal>libc</literal> collation is needed that has different values for
+    <symbol>LC_COLLATE</symbol> and <symbol>LC_CTYPE</symbol>, or new locales
+    are installed in the operating system after the database system was
+    initialized, then a new collation may be created using
+    the <xref linkend="sql-createcollation"> command.
    </para>
 
    <para>
@@ -566,8 +615,8 @@ <title>Managing Collations</title>
     Use of the stripped collation names is recommended, since it will
     make one less thing you need to change if you decide to change to
     another database encoding.  Note however that the <literal>default</>,
-    <literal>C</>, and <literal>POSIX</> collations can be used
-    regardless of the database encoding.
+    <literal>C</>, and <literal>POSIX</> collations, as well as all collations
+    provided by ICU, can be used regardless of the database encoding.
    </para>
 
    <para>
@@ -581,6 +630,104 @@ <title>Managing Collations</title>
     collations have identical behaviors.  Mixing stripped and non-stripped
     collation names is therefore not recommended.
    </para>
+  </sect4>
+
+  <sect4>
+   <title>ICU collations</title>
+
+   <para>
+    Collations provided by ICU are created with names in BCP 47 language tag
+    format, with a <quote>private use</quote>
+    extension <literal>-x-icu</literal> appended, to distinguish them from
+    libc locales.  So <literal>de-x-icu</literal> would be an example.
+   </para>
+
+   <para>
+    With ICU, it is not sensible to enumerate all possible locale names.  ICU
+    uses a particular naming system for locales, but there are many more ways
+    to name a locale than there are actually distinct locales.  (In fact, any
+    string will be accepted as a locale name.)
+    See <ulink url="http://userguide.icu-project.org/locale"></ulink> for
+    information on ICU locale naming.  <command>initdb</command> uses the ICU
+    APIs to extract a set of locales with distinct collation rules to populate
+    the initial set of collations.  Here are some example collations that
+    might be created:
+
+    <variablelist>
+     <varlistentry>
+      <term><literal>de-x-icu</literal></term>
+      <listitem>
+       <para>German collation, default variant</para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><literal>de-u-co-phonebk-x-icu</literal></term>
+      <listitem>
+       <para>German collation, phone book variant</para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><literal>de-AT-x-icu</literal></term>
+      <listitem>
+       <para>German collation for Austria, default variant</para>
+       <para>
+        (Note that as of this writing, there is no,
+        say, <literal>de-DE-x-icu</literal> or <literal>de-CH-x-icu</literal>,
+        because those are equivalent to <literal>de-x-icu</literal>.)
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><literal>de-AT-u-co-phonebk-x-icu</literal></term>
+      <listitem>
+       <para>German collation for Austria, phone book variant</para>
+      </listitem>
+     </varlistentry>
+     <varlistentry>
+      <term><literal>und-x-icu</literal> (for <quote>undefined</quote>)</term>
+      <listitem>
+       <para>
+        ICU <quote>root</quote> collation.  Use this to get a reasonable
+        language-agnostic sort order.
+       </para>
+      </listitem>
+     </varlistentry>
+    </variablelist>
+   </para>
+
+   <para>
+    Some (less frequently used) encodings are not supported by ICU.  If the
+    database cluster was initialized with such an encoding, no ICU collations
+    will be predefined.
+   </para>
+   </sect4>
+   </sect3>
+
+   <sect3>
+   <title>Copying Collations</title>
+
+   <para>
+    The command <xref linkend="sql-createcollation"> can also be used to
+    create a new collation from an existing collation, which can be useful to
+    be able to use operating-system-independent collation names in
+    applications, create compatibility names, or use an ICU-provided collation
+    under a more readable name.  For example:
+<programlisting>
+CREATE COLLATION german FROM "de_DE";
+CREATE COLLATION french FROM "fr-x-icu";
+CREATE COLLATION "de-DE-x-icu" FROM "de-x-icu";
+</programlisting>
+   </para>
+
+   <para>
+    The standard and predefined collations are in the
+    schema <literal>pg_catalog</literal>, like all predefined objects.
+    User-defined collations should be created in user schemas.  This also
+    ensures that they are saved by <command>pg_dump</command>.
+   </para>
   </sect2>
  </sect1>
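
A quick usage sketch, assuming an ICU-enabled build where initdb has created the predefined und-x-icu collation described above: an ICU collation applies per expression like any other named collation.

    SELECT name
    FROM unnest(ARRAY['bank', 'Bánk', 'banka']) AS t(name)
    ORDER BY name COLLATE "und-x-icu";
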
 
diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
index 568995c9f2..b35ffa1e5c 100644
--- a/doc/src/sgml/installation.sgml
+++ b/doc/src/sgml/installation.sgml
@@ -767,6 +767,20 @@ <title>Configuration</title>
       </varlistentry>
 
       <varlistentry>
+       <term><option>--with-icu</option></term>
+       <listitem>
+        <para>
+         Build with support for
+         the <productname>ICU</productname><indexterm><primary>ICU</></>
+         library.  This requires the <productname>ICU4C</productname> package
+         as well
+         as <productname>pkg-config</productname><indexterm><primary>pkg-config</></>
+         to be installed.
+        </para>
+       </listitem>
+      </varlistentry>
+
+      <varlistentry>
        <term><option>--with-openssl</option>
        <indexterm>
         <primary>OpenSSL</primary>
diff --git a/doc/src/sgml/ref/alter_collation.sgml b/doc/src/sgml/ref/alter_collation.sgml
index 6708c7e10e..6ff842e671 100644
--- a/doc/src/sgml/ref/alter_collation.sgml
+++ b/doc/src/sgml/ref/alter_collation.sgml
@@ -21,6 +21,8 @@
 
  <refsynopsisdiv>
 <synopsis>
+ALTER COLLATION <replaceable>name</replaceable> REFRESH VERSION
+
 ALTER COLLATION <replaceable>name</replaceable> RENAME TO <replaceable>new_name</replaceable>
 ALTER COLLATION <replaceable>name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_USER | SESSION_USER }
 ALTER COLLATION <replaceable>name</replaceable> SET SCHEMA <replaceable>new_schema</replaceable>
@@ -85,9 +87,49 @@ <title>Parameters</title>
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term><literal>REFRESH VERSION</literal></term>
+    <listitem>
+     <para>
+      Update the collation version.
+      See <xref linkend="sql-altercollation-notes"> below.
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
  </refsect1>
 
+ <refsect1 id="sql-altercollation-notes">
+  <title>Notes</title>
+
+  <para>
+   When using collations provided by the ICU library, the ICU-specific version
+   of the collator is recorded in the system catalog when the collation object
+   is created.  When the collation is then used, the current version is
+   checked against the recorded version, and a warning is issued when there is
+   a mismatch, for example:
+<screen>
+WARNING:  ICU collator version mismatch
+DETAIL:  The database was created using version 1.2.3.4, the library provides version 2.3.4.5.
+HINT:  Rebuild all objects affected by this collation and run ALTER COLLATION pg_catalog."xx-x-icu" REFRESH VERSION, or build PostgreSQL with the right version of ICU.
+</screen>
+   A change in collation definitions can lead to corrupt indexes and other
+   problems where the database system relies on stored objects having a
+   certain sort order.  Generally, this should be avoided, but it can happen
+   in legitimate circumstances, such as when
+   using <command>pg_upgrade</command> to upgrade to server binaries linked
+   with a newer version of ICU.  When this happens, all objects depending on
+   the collation should be rebuilt, for example,
+   using <command>REINDEX</command>.  When that is done, the collation version
+   can be refreshed using the command <literal>ALTER COLLATION ... REFRESH
+   VERSION</literal>.  This will update the system catalog to record the
+   current collator version and will make the warning go away.  Note that this
+   does not actually check whether all affected objects have been rebuilt
+   correctly.
+  </para>
+ </refsect1>
+
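
A minimal sketch of the refresh workflow described in the Notes section above, assuming an index test_name_idx that depends on the de-x-icu collation (both names are illustrative):

    REINDEX INDEX test_name_idx;
    ALTER COLLATION "de-x-icu" REFRESH VERSION;
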
  <refsect1>
   <title>Examples</title>
 
diff --git a/doc/src/sgml/ref/create_collation.sgml b/doc/src/sgml/ref/create_collation.sgml
index c09e5bd6d4..ee7a8ea5ae 100644
--- a/doc/src/sgml/ref/create_collation.sgml
+++ b/doc/src/sgml/ref/create_collation.sgml
@@ -21,7 +21,8 @@
 CREATE COLLATION [ IF NOT EXISTS ] <replaceable>name</replaceable> (
     [ LOCALE = <replaceable>locale</replaceable>, ]
     [ LC_COLLATE = <replaceable>lc_collate</replaceable>, ]
-    [ LC_CTYPE = <replaceable>lc_ctype</replaceable> ]
+    [ LC_CTYPE = <replaceable>lc_ctype</replaceable>, ]
+    [ PROVIDER = <replaceable>provider</replaceable> ]
 )
 CREATE COLLATION [ IF NOT EXISTS ] <replaceable>name</replaceable> FROM <replaceable>existing_collation</replaceable>
 </synopsis>
@@ -114,6 +115,19 @@ <title>Parameters</title>
     </varlistentry>
 
     <varlistentry>
+     <term><replaceable>provider</replaceable></term>
+
+     <listitem>
+      <para>
+       Specifies the provider to use for locale services associated with this
+       collation.  Possible values
+       are: <literal>icu</literal>,<indexterm><primary>ICU</></> <literal>libc</literal>.
+       The available choices depend on the operating system and build options.
+      </para>
+     </listitem>
+    </varlistentry>
+
+    <varlistentry>
      <term><replaceable>existing_collation</replaceable></term>
 
      <listitem>
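
As a usage sketch for the new PROVIDER parameter, assuming an ICU-enabled build (the collation name and locale string are illustrative; ICU accepts a wide range of locale identifiers):

    CREATE COLLATION german_phonebook
        (PROVIDER = icu, LOCALE = 'de-u-co-phonebk');
    SELECT 'Göbel' < 'Goethe' COLLATE german_phonebook;
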
diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index e7862016aa..59f333d9b3 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -179,6 +179,7 @@ pgxsdir = $(pkglibdir)/pgxs
 #
 # Records the choice of the various --enable-xxx and --with-xxx options.
 
+with_icu	= @with_icu@
 with_perl	= @with_perl@
 with_python	= @with_python@
 with_tcl	= @with_tcl@
@@ -208,6 +209,9 @@ python_version		= @python_version@
 
 krb_srvtab = @krb_srvtab@
 
+ICU_CFLAGS		= @ICU_CFLAGS@
+ICU_LIBS		= @ICU_LIBS@
+
 TCLSH			= @TCLSH@
 TCL_LIBS		= @TCL_LIBS@
 TCL_LIB_SPEC		= @TCL_LIB_SPEC@
diff --git a/src/backend/Makefile b/src/backend/Makefile
index 7a0bbb2942..fffb0d95ba 100644
--- a/src/backend/Makefile
+++ b/src/backend/Makefile
@@ -58,7 +58,7 @@ ifneq ($(PORTNAME), win32)
 ifneq ($(PORTNAME), aix)
 
 postgres: $(OBJS)
-	$(CC) $(CFLAGS) $(LDFLAGS) $(LDFLAGS_EX) $(export_dynamic) $(call expand_subsys,$^) $(LIBS) -o $@
+	$(CC) $(CFLAGS) $(LDFLAGS) $(LDFLAGS_EX) $(export_dynamic) $(call expand_subsys,$^) $(LIBS) $(ICU_LIBS) -o $@
 
 endif
 endif
diff --git a/src/backend/catalog/pg_collation.c b/src/backend/catalog/pg_collation.c
index 65b6051c0d..ffe68aa114 100644
--- a/src/backend/catalog/pg_collation.c
+++ b/src/backend/catalog/pg_collation.c
@@ -27,6 +27,7 @@
 #include "mb/pg_wchar.h"
 #include "utils/builtins.h"
 #include "utils/fmgroids.h"
+#include "utils/pg_locale.h"
 #include "utils/rel.h"
 #include "utils/syscache.h"
 #include "utils/tqual.h"
@@ -40,10 +41,12 @@
 Oid
 CollationCreate(const char *collname, Oid collnamespace,
 				Oid collowner,
+				char collprovider,
 				int32 collencoding,
 				const char *collcollate, const char *collctype,
 				bool if_not_exists)
 {
+	char	   *collversion;
 	Relation	rel;
 	TupleDesc	tupDesc;
 	HeapTuple	tup;
@@ -78,21 +81,27 @@ CollationCreate(const char *collname, Oid collnamespace,
 		{
 			ereport(NOTICE,
 				(errcode(ERRCODE_DUPLICATE_OBJECT),
-				 errmsg("collation \"%s\" for encoding \"%s\" already exists, skipping",
-						collname, pg_encoding_to_char(collencoding))));
+				 collencoding == -1
+				 ? errmsg("collation \"%s\" already exists, skipping",
+						  collname)
+				 : errmsg("collation \"%s\" for encoding \"%s\" already exists, skipping",
+						  collname, pg_encoding_to_char(collencoding))));
 			return InvalidOid;
 		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_DUPLICATE_OBJECT),
-					 errmsg("collation \"%s\" for encoding \"%s\" already exists",
-							collname, pg_encoding_to_char(collencoding))));
+					 collencoding == -1
+					 ? errmsg("collation \"%s\" already exists",
+							  collname)
+					 : errmsg("collation \"%s\" for encoding \"%s\" already exists",
+							  collname, pg_encoding_to_char(collencoding))));
 	}
 
 	/*
 	 * Also forbid matching an any-encoding entry.  This test of course is not
 	 * backed up by the unique index, but it's not a problem since we don't
-	 * support adding any-encoding entries after initdb.
+	 * support adding any-encoding entries after initdb. FIXME
 	 */
 	if (SearchSysCacheExists3(COLLNAMEENCNSP,
 							  PointerGetDatum(collname),
@@ -114,6 +123,8 @@ CollationCreate(const char *collname, Oid collnamespace,
 						collname)));
 	}
 
+	collversion = get_system_collation_version(collprovider, collcollate);
+
 	/* open pg_collation */
 	rel = heap_open(CollationRelationId, RowExclusiveLock);
 	tupDesc = RelationGetDescr(rel);
@@ -125,11 +136,16 @@ CollationCreate(const char *collname, Oid collnamespace,
 	values[Anum_pg_collation_collname - 1] = NameGetDatum(&name_name);
 	values[Anum_pg_collation_collnamespace - 1] = ObjectIdGetDatum(collnamespace);
 	values[Anum_pg_collation_collowner - 1] = ObjectIdGetDatum(collowner);
+	values[Anum_pg_collation_collprovider - 1] = CharGetDatum(collprovider);
 	values[Anum_pg_collation_collencoding - 1] = Int32GetDatum(collencoding);
 	namestrcpy(&name_collate, collcollate);
 	values[Anum_pg_collation_collcollate - 1] = NameGetDatum(&name_collate);
 	namestrcpy(&name_ctype, collctype);
 	values[Anum_pg_collation_collctype - 1] = NameGetDatum(&name_ctype);
+	if (collversion)
+		values[Anum_pg_collation_collversion - 1] = CStringGetTextDatum(collversion);
+	else
+		nulls[Anum_pg_collation_collversion - 1] = true;
 
 	tup = heap_form_tuple(tupDesc, values, nulls);
 
diff --git a/src/backend/commands/collationcmds.c b/src/backend/commands/collationcmds.c
index 919cfc6a06..dde06cf12f 100644
--- a/src/backend/commands/collationcmds.c
+++ b/src/backend/commands/collationcmds.c
@@ -14,15 +14,18 @@
  */
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/htup_details.h"
 #include "access/xact.h"
 #include "catalog/dependency.h"
 #include "catalog/indexing.h"
 #include "catalog/namespace.h"
+#include "catalog/objectaccess.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_collation_fn.h"
 #include "commands/alter.h"
 #include "commands/collationcmds.h"
+#include "commands/comment.h"
 #include "commands/dbcommands.h"
 #include "commands/defrem.h"
 #include "mb/pg_wchar.h"
@@ -33,6 +36,7 @@
 #include "utils/rel.h"
 #include "utils/syscache.h"
 
+
 /*
  * CREATE COLLATION
  */
@@ -47,8 +51,12 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
 	DefElem    *localeEl = NULL;
 	DefElem    *lccollateEl = NULL;
 	DefElem    *lcctypeEl = NULL;
+	DefElem    *providerEl = NULL;
 	char	   *collcollate = NULL;
 	char	   *collctype = NULL;
+	char	   *collproviderstr = NULL;
+	int			collencoding;
+	char		collprovider = 0;
 	Oid			newoid;
 	ObjectAddress address;
 
@@ -72,6 +80,8 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
 			defelp = &lccollateEl;
 		else if (pg_strcasecmp(defel->defname, "lc_ctype") == 0)
 			defelp = &lcctypeEl;
+		else if (pg_strcasecmp(defel->defname, "provider") == 0)
+			defelp = &providerEl;
 		else
 		{
 			ereport(ERROR,
@@ -103,6 +113,7 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
 
 		collcollate = pstrdup(NameStr(((Form_pg_collation) GETSTRUCT(tp))->collcollate));
 		collctype = pstrdup(NameStr(((Form_pg_collation) GETSTRUCT(tp))->collctype));
+		collprovider = ((Form_pg_collation) GETSTRUCT(tp))->collprovider;
 
 		ReleaseSysCache(tp);
 	}
@@ -119,6 +130,24 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
 	if (lcctypeEl)
 		collctype = defGetString(lcctypeEl);
 
+	if (providerEl)
+		collproviderstr = defGetString(providerEl);
+
+	if (collproviderstr)
+	{
+		if (pg_strcasecmp(collproviderstr, "icu") == 0)
+			collprovider = COLLPROVIDER_ICU;
+		else if (pg_strcasecmp(collproviderstr, "libc") == 0)
+			collprovider = COLLPROVIDER_LIBC;
+		else
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+					 errmsg("unrecognized collation provider: %s",
+							collproviderstr)));
+	}
+	else if (!fromEl)
+		collprovider = COLLPROVIDER_LIBC;
+
 	if (!collcollate)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
@@ -129,12 +158,19 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
 				(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
 				 errmsg("parameter \"lc_ctype\" must be specified")));
 
-	check_encoding_locale_matches(GetDatabaseEncoding(), collcollate, collctype);
+	if (collprovider == COLLPROVIDER_ICU)
+		collencoding = -1;
+	else
+	{
+		collencoding = GetDatabaseEncoding();
+		check_encoding_locale_matches(collencoding, collcollate, collctype);
+	}
 
 	newoid = CollationCreate(collName,
 							 collNamespace,
 							 GetUserId(),
-							 GetDatabaseEncoding(),
+							 collprovider,
+							 collencoding,
 							 collcollate,
 							 collctype,
 							 if_not_exists);
@@ -182,16 +218,89 @@ IsThereCollationInNamespace(const char *collname, Oid nspOid)
 						collname, get_namespace_name(nspOid))));
 }
 
+/*
+ * ALTER COLLATION
+ */
+ObjectAddress
+AlterCollation(AlterCollationStmt *stmt)
+{
+	Relation	rel;
+	Oid			collOid;
+	HeapTuple	tup;
+	Form_pg_collation collForm;
+	Datum		collversion;
+	bool		isnull;
+	char	   *oldversion;
+	char	   *newversion;
+	ObjectAddress address;
+
+	rel = heap_open(CollationRelationId, RowExclusiveLock);
+	collOid = get_collation_oid(stmt->collname, false);
+
+	if (!pg_collation_ownercheck(collOid, GetUserId()))
+		aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_COLLATION,
+					   NameListToString(stmt->collname));
+
+	tup = SearchSysCacheCopy1(COLLOID, ObjectIdGetDatum(collOid));
+	if (!HeapTupleIsValid(tup))
+		elog(ERROR, "cache lookup failed for collation %u", collOid);
+
+	collForm = (Form_pg_collation) GETSTRUCT(tup);
+	collversion = SysCacheGetAttr(COLLOID, tup, Anum_pg_collation_collversion,
+								  &isnull);
+	oldversion = isnull ? NULL : TextDatumGetCString(collversion);
+
+	newversion = get_system_collation_version(collForm->collprovider, NameStr(collForm->collcollate));
+
+	/* cannot change from NULL to non-NULL or vice versa */
+	if ((!oldversion && newversion) || (oldversion && !newversion))
+		elog(ERROR, "invalid collation version change");
+	else if (oldversion && newversion && strcmp(newversion, oldversion) != 0)
+	{
+		bool        nulls[Natts_pg_collation];
+		bool        replaces[Natts_pg_collation];
+		Datum       values[Natts_pg_collation];
+
+		ereport(NOTICE,
+				(errmsg("changing version from %s to %s",
+						oldversion, newversion)));
+
+		memset(values, 0, sizeof(values));
+		memset(nulls, false, sizeof(nulls));
+		memset(replaces, false, sizeof(replaces));
+
+		values[Anum_pg_collation_collversion - 1] = CStringGetTextDatum(newversion);
+		replaces[Anum_pg_collation_collversion - 1] = true;
+
+		tup = heap_modify_tuple(tup, RelationGetDescr(rel),
+								values, nulls, replaces);
+	}
+	else
+		ereport(NOTICE,
+				(errmsg("version has not changed")));
+
+	CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+	InvokeObjectPostAlterHook(CollationRelationId, collOid, 0);
+
+	ObjectAddressSet(address, CollationRelationId, collOid);
+
+	heap_freetuple(tup);
+	heap_close(rel, NoLock);
+
+	return address;
+}
+
 
 /*
- * "Normalize" a locale name, stripping off encoding tags such as
+ * "Normalize" a libc locale name, stripping off encoding tags such as
  * ".utf8" (e.g., "en_US.utf8" -> "en_US", but "br_FR.iso885915@euro"
  * -> "br_FR@euro").  Return true if a new, different name was
  * generated.
  */
 pg_attribute_unused()
 static bool
-normalize_locale_name(char *new, const char *old)
+normalize_libc_locale_name(char *new, const char *old)
 {
 	char	   *n = new;
 	const char *o = old;
@@ -219,6 +328,46 @@ normalize_locale_name(char *new, const char *old)
 }
 
 
+#ifdef USE_ICU
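+/*
+ * Convert a locale name in ICU's internal format into a BCP 47 language tag.
+ */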
+static char *
+get_icu_language_tag(const char *localename)
+{
+	char		buf[ULOC_FULLNAME_CAPACITY];
+	UErrorCode	status;
+
+	status = U_ZERO_ERROR;
+	uloc_toLanguageTag(localename, buf, sizeof(buf), TRUE, &status);
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("could not convert locale name \"%s\" to language tag: %s",
+						localename, u_errorName(status))));
+
+	return pstrdup(buf);
+}
+
+
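+/*
+ * Look up the English display name of an ICU locale, used as the comment on
+ * the collation created for it.
+ */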
+static char *
+get_icu_locale_comment(const char *localename)
+{
+	UErrorCode	status;
+	UChar		displayname[128];
+	int32		len_uchar;
+	char	   *result;
+
+	status = U_ZERO_ERROR;
+	len_uchar = uloc_getDisplayName(localename, "en", &displayname[0], lengthof(displayname), &status);
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("could not get display name for locale \"%s\": %s",
+						localename, u_errorName(status))));
+
+	icu_from_uchar(&result, displayname, len_uchar);
+
+	return result;
+}
+#endif	/* USE_ICU */
+
+
 Datum
 pg_import_system_collations(PG_FUNCTION_ARGS)
 {
@@ -302,7 +451,7 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
 
 		count++;
 
-		CollationCreate(localebuf, nspid, GetUserId(), enc,
+		CollationCreate(localebuf, nspid, GetUserId(), COLLPROVIDER_LIBC, enc,
 						localebuf, localebuf, if_not_exists);
 
 		CommandCounterIncrement();
@@ -316,7 +465,7 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
 		 * "locale -a" output.  So save up the aliases and try to add them
 		 * after we've read all the output.
 		 */
-		if (normalize_locale_name(alias, localebuf))
+		if (normalize_libc_locale_name(alias, localebuf))
 		{
 			aliaslist = lappend(aliaslist, pstrdup(alias));
 			localelist = lappend(localelist, pstrdup(localebuf));
@@ -333,7 +482,7 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
 		char	   *locale = (char *) lfirst(lcl);
 		int			enc = lfirst_int(lce);
 
-		CollationCreate(alias, nspid, GetUserId(), enc,
+		CollationCreate(alias, nspid, GetUserId(), COLLPROVIDER_LIBC, enc,
 						locale, locale, true);
 		CommandCounterIncrement();
 	}
@@ -343,5 +492,85 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
 				(errmsg("no usable system locales were found")));
 #endif   /* not HAVE_LOCALE_T && not WIN32 */
 
+#ifdef USE_ICU
+	if (!is_encoding_supported_by_icu(GetDatabaseEncoding()))
+	{
+		ereport(NOTICE,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("encoding \"%s\" not supported by ICU",
+						pg_encoding_to_char(GetDatabaseEncoding()))));
+	}
+	else
+	{
+		int i;
+
+		/*
+		 * Start the loop at -1 to sneak in the root locale without too much
+		 * code duplication.
+		 */
+		for (i = -1; i < ucol_countAvailable(); i++)
+		{
+			const char *name;
+			char	   *langtag;
+			UEnumeration *en;
+			UErrorCode	status;
+			const char *val;
+			Oid			collid;
+
+			if (i == -1)
+				name = "";  /* ICU root locale */
+			else
+				name = ucol_getAvailable(i);
+
+			langtag = get_icu_language_tag(name);
+			collid = CollationCreate(psprintf("%s-x-icu", langtag),
+									 nspid, GetUserId(), COLLPROVIDER_ICU, -1,
+#if U_ICU_VERSION_MAJOR_NUM >= 54
+									 langtag, langtag,
+#else
+									 name, name,
+#endif
+									 if_not_exists);
+
+			CreateComments(collid, CollationRelationId, 0,
+						   get_icu_locale_comment(name));
+
+			/*
+			 * Add keyword variants
+			 */
+			status = U_ZERO_ERROR;
+			en = ucol_getKeywordValuesForLocale("collation", name, TRUE, &status);
+			if (U_FAILURE(status))
+				ereport(ERROR,
+						(errmsg("could not get keyword values for locale \"%s\": %s",
+								name, u_errorName(status))));
+
+			status = U_ZERO_ERROR;
+			uenum_reset(en, &status);
+			while ((val = uenum_next(en, NULL, &status)))
+			{
+				char *localeid = psprintf("%s@collation=%s", name, val);
+
+				langtag = get_icu_language_tag(localeid);
+				collid = CollationCreate(psprintf("%s-x-icu", langtag),
+										 nspid, GetUserId(), COLLPROVIDER_ICU, -1,
+#if U_ICU_VERSION_MAJOR_NUM >= 54
+										 langtag, langtag,
+#else
+										 localeid, localeid,
+#endif
+										 if_not_exists);
+				CreateComments(collid, CollationRelationId, 0,
+							   get_icu_locale_comment(localeid));
+			}
+			if (U_FAILURE(status))
+				ereport(ERROR,
+						(errmsg("could not get keyword values for locale \"%s\": %s",
+								name, u_errorName(status))));
+			uenum_close(en);
+		}
+	}
+#endif
+
 	PG_RETURN_VOID();
 }
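
Not a definitive check, but a convenient way to see what the import loop above produces in an ICU-enabled cluster (collprovider 'i' per the catalog documentation earlier in the patch):

    SELECT collname, collcollate,
           obj_description(oid, 'pg_collation') AS description
    FROM pg_collation
    WHERE collprovider = 'i'
    ORDER BY collname;
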
diff --git a/src/backend/common.mk b/src/backend/common.mk
index 5d599dbd0c..0b57543bc4 100644
--- a/src/backend/common.mk
+++ b/src/backend/common.mk
@@ -8,6 +8,8 @@
 # this directory and SUBDIRS to subdirectories containing more things
 # to build.
 
+override CPPFLAGS := $(CPPFLAGS) $(ICU_CFLAGS)
+
 ifdef PARTIAL_LINKING
 # old style: linking using SUBSYS.o
 subsysfilename = SUBSYS.o
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index bfc2ac1716..71daabbf97 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -3028,6 +3028,16 @@ _copyAlterTableCmd(const AlterTableCmd *from)
 	return newnode;
 }
 
+static AlterCollationStmt *
+_copyAlterCollationStmt(const AlterCollationStmt *from)
+{
+	AlterCollationStmt *newnode = makeNode(AlterCollationStmt);
+
+	COPY_NODE_FIELD(collname);
+
+	return newnode;
+}
+
 static AlterDomainStmt *
 _copyAlterDomainStmt(const AlterDomainStmt *from)
 {
@@ -4959,6 +4969,9 @@ copyObject(const void *from)
 		case T_AlterTableCmd:
 			retval = _copyAlterTableCmd(from);
 			break;
+		case T_AlterCollationStmt:
+			retval = _copyAlterCollationStmt(from);
+			break;
 		case T_AlterDomainStmt:
 			retval = _copyAlterDomainStmt(from);
 			break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 54e9c983a0..57fe7ca61e 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1087,6 +1087,14 @@ _equalAlterTableCmd(const AlterTableCmd *a, const AlterTableCmd *b)
 }
 
 static bool
+_equalAlterCollationStmt(const AlterCollationStmt *a, const AlterCollationStmt *b)
+{
+	COMPARE_NODE_FIELD(collname);
+
+	return true;
+}
+
+static bool
 _equalAlterDomainStmt(const AlterDomainStmt *a, const AlterDomainStmt *b)
 {
 	COMPARE_SCALAR_FIELD(subtype);
@@ -3156,6 +3164,9 @@ equal(const void *a, const void *b)
 		case T_AlterTableCmd:
 			retval = _equalAlterTableCmd(a, b);
 			break;
+		case T_AlterCollationStmt:
+			retval = _equalAlterCollationStmt(a, b);
+			break;
 		case T_AlterDomainStmt:
 			retval = _equalAlterDomainStmt(a, b);
 			break;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index e7acc2d9a2..e337abd78d 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -244,7 +244,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 }
 
 %type <node>	stmt schema_stmt
-		AlterEventTrigStmt
+		AlterEventTrigStmt AlterCollationStmt
 		AlterDatabaseStmt AlterDatabaseSetStmt AlterDomainStmt AlterEnumStmt
 		AlterFdwStmt AlterForeignServerStmt AlterGroupStmt
 		AlterObjectDependsStmt AlterObjectSchemaStmt AlterOwnerStmt
@@ -812,6 +812,7 @@ stmtmulti:	stmtmulti ';' stmt
 
 stmt :
 			AlterEventTrigStmt
+			| AlterCollationStmt
 			| AlterDatabaseStmt
 			| AlterDatabaseSetStmt
 			| AlterDefaultPrivilegesStmt
@@ -9633,6 +9634,21 @@ DropdbStmt: DROP DATABASE database_name
 
 /*****************************************************************************
  *
+ *		ALTER COLLATION
+ *
+ *****************************************************************************/
+
+AlterCollationStmt: ALTER COLLATION any_name REFRESH VERSION_P
+				{
+					AlterCollationStmt *n = makeNode(AlterCollationStmt);
+					n->collname = $3;
+					$$ = (Node *)n;
+				}
+		;
+
+
+/*****************************************************************************
+ *
  *		ALTER SYSTEM
  *
  * This is used to change configuration parameters persistently.
diff --git a/src/backend/regex/regc_pg_locale.c b/src/backend/regex/regc_pg_locale.c
index 0121cbb2ad..4bdcb4fd6a 100644
--- a/src/backend/regex/regc_pg_locale.c
+++ b/src/backend/regex/regc_pg_locale.c
@@ -68,7 +68,8 @@ typedef enum
 	PG_REGEX_LOCALE_WIDE,		/* Use <wctype.h> functions */
 	PG_REGEX_LOCALE_1BYTE,		/* Use <ctype.h> functions */
 	PG_REGEX_LOCALE_WIDE_L,		/* Use locale_t <wctype.h> functions */
-	PG_REGEX_LOCALE_1BYTE_L		/* Use locale_t <ctype.h> functions */
+	PG_REGEX_LOCALE_1BYTE_L,	/* Use locale_t <ctype.h> functions */
+	PG_REGEX_LOCALE_ICU			/* Use ICU uchar.h functions */
 } PG_Locale_Strategy;
 
 static PG_Locale_Strategy pg_regex_strategy;
@@ -262,6 +263,11 @@ pg_set_regex_collation(Oid collation)
 					 errhint("Use the COLLATE clause to set the collation explicitly.")));
 		}
 
+#ifdef USE_ICU
+		if (pg_regex_locale && pg_regex_locale->provider == COLLPROVIDER_ICU)
+			pg_regex_strategy = PG_REGEX_LOCALE_ICU;
+		else
+#endif
 #ifdef USE_WIDE_UPPER_LOWER
 		if (GetDatabaseEncoding() == PG_UTF8)
 		{
@@ -303,13 +309,18 @@ pg_wc_isdigit(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswdigit_l((wint_t) c, pg_regex_locale);
+				return iswdigit_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isdigit_l((unsigned char) c, pg_regex_locale));
+					isdigit_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isdigit(c);
 #endif
 			break;
 	}
@@ -336,13 +347,18 @@ pg_wc_isalpha(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswalpha_l((wint_t) c, pg_regex_locale);
+				return iswalpha_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isalpha_l((unsigned char) c, pg_regex_locale));
+					isalpha_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isalpha(c);
 #endif
 			break;
 	}
@@ -369,13 +385,18 @@ pg_wc_isalnum(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswalnum_l((wint_t) c, pg_regex_locale);
+				return iswalnum_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isalnum_l((unsigned char) c, pg_regex_locale));
+					isalnum_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isalnum(c);
 #endif
 			break;
 	}
@@ -402,13 +423,18 @@ pg_wc_isupper(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswupper_l((wint_t) c, pg_regex_locale);
+				return iswupper_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isupper_l((unsigned char) c, pg_regex_locale));
+					isupper_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isupper(c);
 #endif
 			break;
 	}
@@ -435,13 +461,18 @@ pg_wc_islower(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswlower_l((wint_t) c, pg_regex_locale);
+				return iswlower_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					islower_l((unsigned char) c, pg_regex_locale));
+					islower_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_islower(c);
 #endif
 			break;
 	}
@@ -468,13 +499,18 @@ pg_wc_isgraph(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswgraph_l((wint_t) c, pg_regex_locale);
+				return iswgraph_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isgraph_l((unsigned char) c, pg_regex_locale));
+					isgraph_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isgraph(c);
 #endif
 			break;
 	}
@@ -501,13 +537,18 @@ pg_wc_isprint(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswprint_l((wint_t) c, pg_regex_locale);
+				return iswprint_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isprint_l((unsigned char) c, pg_regex_locale));
+					isprint_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isprint(c);
 #endif
 			break;
 	}
@@ -534,13 +575,18 @@ pg_wc_ispunct(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswpunct_l((wint_t) c, pg_regex_locale);
+				return iswpunct_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					ispunct_l((unsigned char) c, pg_regex_locale));
+					ispunct_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_ispunct(c);
 #endif
 			break;
 	}
@@ -567,13 +613,18 @@ pg_wc_isspace(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswspace_l((wint_t) c, pg_regex_locale);
+				return iswspace_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isspace_l((unsigned char) c, pg_regex_locale));
+					isspace_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isspace(c);
 #endif
 			break;
 	}
@@ -608,15 +659,20 @@ pg_wc_toupper(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return towupper_l((wint_t) c, pg_regex_locale);
+				return towupper_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			if (c <= (pg_wchar) UCHAR_MAX)
-				return toupper_l((unsigned char) c, pg_regex_locale);
+				return toupper_l((unsigned char) c, pg_regex_locale->info.lt);
 #endif
 			return c;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_toupper(c);
+#endif
+			break;
 	}
 	return 0;					/* can't get here, but keep compiler quiet */
 }
@@ -649,15 +705,20 @@ pg_wc_tolower(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return towlower_l((wint_t) c, pg_regex_locale);
+				return towlower_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			if (c <= (pg_wchar) UCHAR_MAX)
-				return tolower_l((unsigned char) c, pg_regex_locale);
+				return tolower_l((unsigned char) c, pg_regex_locale->info.lt);
 #endif
 			return c;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_tolower(c);
+#endif
+			break;
 	}
 	return 0;					/* can't get here, but keep compiler quiet */
 }
@@ -808,6 +869,9 @@ pg_ctype_get_cache(pg_wc_probefunc probefunc, int cclasscode)
 			max_chr = (pg_wchar) MAX_SIMPLE_CHR;
 #endif
 			break;
+		case PG_REGEX_LOCALE_ICU:
+			max_chr = (pg_wchar) MAX_SIMPLE_CHR;
+			break;
 		default:
 			max_chr = 0;		/* can't get here, but keep compiler quiet */
 			break;
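
To illustrate the effect of the regex changes above (a sketch, assuming the predefined de-x-icu collation exists): with an ICU collation in effect, character classes such as [[:alpha:]] are answered by ICU's u_isalpha() and friends rather than the libc ctype functions.

    SELECT 'Straße' COLLATE "de-x-icu" ~ '^[[:alpha:]]+$';
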
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 20b5273405..c8d20fffea 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -1623,6 +1623,10 @@ ProcessUtilitySlow(ParseState *pstate,
 				commandCollected = true;
 				break;
 
+			case T_AlterCollationStmt:
+				address = AlterCollation((AlterCollationStmt *) parsetree);
+				break;
+
 			default:
 				elog(ERROR, "unrecognized node type: %d",
 					 (int) nodeTag(parsetree));
@@ -2673,6 +2677,10 @@ CreateCommandTag(Node *parsetree)
 			tag = "DROP SUBSCRIPTION";
 			break;
 
+		case T_AlterCollationStmt:
+			tag = "ALTER COLLATION";
+			break;
+
 		case T_PrepareStmt:
 			tag = "PREPARE";
 			break;
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index c16bfbca93..0566abd314 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -82,6 +82,10 @@
 #include <wctype.h>
 #endif
 
+#ifdef USE_ICU
+#include <unicode/ustring.h>
+#endif
+
 #include "catalog/pg_collation.h"
 #include "mb/pg_wchar.h"
 #include "utils/builtins.h"
@@ -1443,6 +1447,42 @@ str_numth(char *dest, char *num, int type)
  *			upper/lower/initcap functions
  *****************************************************************************/
 
+#ifdef USE_ICU
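+/*
+ * Apply an ICU case-mapping function such as u_strToLower(), retrying once
+ * with a larger buffer if the first attempt reports an overflow.
+ */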
+static int32_t
+icu_convert_case(int32_t (*func)(UChar *, int32_t, const UChar *, int32_t, const char *, UErrorCode *),
+				 pg_locale_t mylocale, UChar **buff_dest, UChar *buff_source, int32_t len_source)
+{
+	UErrorCode	status;
+	int32_t		len_dest;
+
+	len_dest = len_source;  /* try first with same length */
+	*buff_dest = palloc(len_dest * sizeof(**buff_dest));
+	status = U_ZERO_ERROR;
+	len_dest = func(*buff_dest, len_dest, buff_source, len_source, mylocale->info.icu.locale, &status);
+	if (status == U_BUFFER_OVERFLOW_ERROR)
+	{
+		/* try again with adjusted length */
+		pfree(*buff_dest);
+		*buff_dest = palloc(len_dest * sizeof(**buff_dest));
+		status = U_ZERO_ERROR;
+		len_dest = func(*buff_dest, len_dest, buff_source, len_source, mylocale->info.icu.locale, &status);
+	}
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("case conversion failed: %s", u_errorName(status))));
+	return len_dest;
+}
+
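+/*
+ * Wrapper for u_strToTitle() with the default break iterator, matching the
+ * case-mapping function signature expected by icu_convert_case().
+ */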
+static int32_t
+u_strToTitle_default_BI(UChar *dest, int32_t destCapacity,
+						const UChar *src, int32_t srcLength,
+						const char *locale,
+						UErrorCode *pErrorCode)
+{
+	return u_strToTitle(dest, destCapacity, src, srcLength, NULL, locale, pErrorCode);
+}
+#endif
+
 /*
  * If the system provides the needed functions for wide-character manipulation
  * (which are all standardized by C99), then we implement upper/lower/initcap
@@ -1479,12 +1519,9 @@ str_tolower(const char *buff, size_t nbytes, Oid collid)
 		result = asc_tolower(buff, nbytes);
 	}
 #ifdef USE_WIDE_UPPER_LOWER
-	else if (pg_database_encoding_max_length() > 1)
+	else
 	{
 		pg_locale_t mylocale = 0;
-		wchar_t    *workspace;
-		size_t		curr_char;
-		size_t		result_size;
 
 		if (collid != DEFAULT_COLLATION_OID)
 		{
@@ -1502,77 +1539,79 @@ str_tolower(const char *buff, size_t nbytes, Oid collid)
 			mylocale = pg_newlocale_from_collation(collid);
 		}
 
-		/* Overflow paranoia */
-		if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
-			ereport(ERROR,
-					(errcode(ERRCODE_OUT_OF_MEMORY),
-					 errmsg("out of memory")));
+#ifdef USE_ICU
+		if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
+		{
+			int32_t		len_uchar;
+			int32_t		len_conv;
+			UChar	   *buff_uchar;
+			UChar	   *buff_conv;
+
+			len_uchar = icu_to_uchar(&buff_uchar, buff, nbytes);
+			len_conv = icu_convert_case(u_strToLower, mylocale, &buff_conv, buff_uchar, len_uchar);
+			icu_from_uchar(&result, buff_conv, len_conv);
+		}
+		else
+#endif
+		{
+			if (pg_database_encoding_max_length() > 1)
+			{
+				wchar_t    *workspace;
+				size_t		curr_char;
+				size_t		result_size;
 
-		/* Output workspace cannot have more codes than input bytes */
-		workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
+				/* Overflow paranoia */
+				if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
+					ereport(ERROR,
+							(errcode(ERRCODE_OUT_OF_MEMORY),
+							 errmsg("out of memory")));
 
-		char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
+				/* Output workspace cannot have more codes than input bytes */
+				workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
 
-		for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
-		{
+				char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
+
+				for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
+				{
 #ifdef HAVE_LOCALE_T
-			if (mylocale)
-				workspace[curr_char] = towlower_l(workspace[curr_char], mylocale);
-			else
+					if (mylocale)
+						workspace[curr_char] = towlower_l(workspace[curr_char], mylocale->info.lt);
+					else
 #endif
-				workspace[curr_char] = towlower(workspace[curr_char]);
-		}
+						workspace[curr_char] = towlower(workspace[curr_char]);
+				}
 
-		/* Make result large enough; case change might change number of bytes */
-		result_size = curr_char * pg_database_encoding_max_length() + 1;
-		result = palloc(result_size);
+				/* Make result large enough; case change might change number of bytes */
+				result_size = curr_char * pg_database_encoding_max_length() + 1;
+				result = palloc(result_size);
 
-		wchar2char(result, workspace, result_size, mylocale);
-		pfree(workspace);
-	}
+				wchar2char(result, workspace, result_size, mylocale);
+				pfree(workspace);
+			}
 #endif   /* USE_WIDE_UPPER_LOWER */
-	else
-	{
-#ifdef HAVE_LOCALE_T
-		pg_locale_t mylocale = 0;
-#endif
-		char	   *p;
-
-		if (collid != DEFAULT_COLLATION_OID)
-		{
-			if (!OidIsValid(collid))
+			else
 			{
-				/*
-				 * This typically means that the parser could not resolve a
-				 * conflict of implicit collations, so report it that way.
-				 */
-				ereport(ERROR,
-						(errcode(ERRCODE_INDETERMINATE_COLLATION),
-						 errmsg("could not determine which collation to use for lower() function"),
-						 errhint("Use the COLLATE clause to set the collation explicitly.")));
-			}
-#ifdef HAVE_LOCALE_T
-			mylocale = pg_newlocale_from_collation(collid);
-#endif
-		}
+				char	   *p;
 
-		result = pnstrdup(buff, nbytes);
+				result = pnstrdup(buff, nbytes);
 
-		/*
-		 * Note: we assume that tolower_l() will not be so broken as to need
-		 * an isupper_l() guard test.  When using the default collation, we
-		 * apply the traditional Postgres behavior that forces ASCII-style
-		 * treatment of I/i, but in non-default collations you get exactly
-		 * what the collation says.
-		 */
-		for (p = result; *p; p++)
-		{
+				/*
+				 * Note: we assume that tolower_l() will not be so broken as to need
+				 * an isupper_l() guard test.  When using the default collation, we
+				 * apply the traditional Postgres behavior that forces ASCII-style
+				 * treatment of I/i, but in non-default collations you get exactly
+				 * what the collation says.
+				 */
+				for (p = result; *p; p++)
+				{
 #ifdef HAVE_LOCALE_T
-			if (mylocale)
-				*p = tolower_l((unsigned char) *p, mylocale);
-			else
+					if (mylocale)
+						*p = tolower_l((unsigned char) *p, mylocale->info.lt);
+					else
 #endif
-				*p = pg_tolower((unsigned char) *p);
+						*p = pg_tolower((unsigned char) *p);
+				}
+			}
 		}
 	}
 
@@ -1599,12 +1638,9 @@ str_toupper(const char *buff, size_t nbytes, Oid collid)
 		result = asc_toupper(buff, nbytes);
 	}
 #ifdef USE_WIDE_UPPER_LOWER
-	else if (pg_database_encoding_max_length() > 1)
+	else
 	{
 		pg_locale_t mylocale = 0;
-		wchar_t    *workspace;
-		size_t		curr_char;
-		size_t		result_size;
 
 		if (collid != DEFAULT_COLLATION_OID)
 		{
@@ -1622,77 +1658,78 @@ str_toupper(const char *buff, size_t nbytes, Oid collid)
 			mylocale = pg_newlocale_from_collation(collid);
 		}
 
-		/* Overflow paranoia */
-		if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
-			ereport(ERROR,
-					(errcode(ERRCODE_OUT_OF_MEMORY),
-					 errmsg("out of memory")));
+#ifdef USE_ICU
+		if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
+		{
+			int32_t		len_uchar, len_conv;
+			UChar	   *buff_uchar;
+			UChar	   *buff_conv;
 
-		/* Output workspace cannot have more codes than input bytes */
-		workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
+			len_uchar = icu_to_uchar(&buff_uchar, buff, nbytes);
+			len_conv = icu_convert_case(u_strToUpper, mylocale, &buff_conv, buff_uchar, len_uchar);
+			icu_from_uchar(&result, buff_conv, len_conv);
+		}
+		else
+#endif
+		{
+			if (pg_database_encoding_max_length() > 1)
+			{
+				wchar_t    *workspace;
+				size_t		curr_char;
+				size_t		result_size;
 
-		char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
+				/* Overflow paranoia */
+				if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
+					ereport(ERROR,
+							(errcode(ERRCODE_OUT_OF_MEMORY),
+							 errmsg("out of memory")));
 
-		for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
-		{
-#ifdef HAVE_LOCALE_T
-			if (mylocale)
-				workspace[curr_char] = towupper_l(workspace[curr_char], mylocale);
-			else
-#endif
-				workspace[curr_char] = towupper(workspace[curr_char]);
-		}
+				/* Output workspace cannot have more codes than input bytes */
+				workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
 
-		/* Make result large enough; case change might change number of bytes */
-		result_size = curr_char * pg_database_encoding_max_length() + 1;
-		result = palloc(result_size);
+				char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
 
-		wchar2char(result, workspace, result_size, mylocale);
-		pfree(workspace);
-	}
-#endif   /* USE_WIDE_UPPER_LOWER */
-	else
-	{
+				for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
+				{
 #ifdef HAVE_LOCALE_T
-		pg_locale_t mylocale = 0;
+					if (mylocale)
+						workspace[curr_char] = towupper_l(workspace[curr_char], mylocale->info.lt);
+					else
 #endif
-		char	   *p;
+						workspace[curr_char] = towupper(workspace[curr_char]);
+				}
 
-		if (collid != DEFAULT_COLLATION_OID)
-		{
-			if (!OidIsValid(collid))
-			{
-				/*
-				 * This typically means that the parser could not resolve a
-				 * conflict of implicit collations, so report it that way.
-				 */
-				ereport(ERROR,
-						(errcode(ERRCODE_INDETERMINATE_COLLATION),
-						 errmsg("could not determine which collation to use for upper() function"),
-						 errhint("Use the COLLATE clause to set the collation explicitly.")));
+				/* Make result large enough; case change might change number of bytes */
+				result_size = curr_char * pg_database_encoding_max_length() + 1;
+				result = palloc(result_size);
+
+				wchar2char(result, workspace, result_size, mylocale);
+				pfree(workspace);
 			}
-#ifdef HAVE_LOCALE_T
-			mylocale = pg_newlocale_from_collation(collid);
-#endif
-		}
+#endif   /* USE_WIDE_UPPER_LOWER */
+			else
+			{
+				char	   *p;
 
-		result = pnstrdup(buff, nbytes);
+				result = pnstrdup(buff, nbytes);
 
-		/*
-		 * Note: we assume that toupper_l() will not be so broken as to need
-		 * an islower_l() guard test.  When using the default collation, we
-		 * apply the traditional Postgres behavior that forces ASCII-style
-		 * treatment of I/i, but in non-default collations you get exactly
-		 * what the collation says.
-		 */
-		for (p = result; *p; p++)
-		{
+				/*
+				 * Note: we assume that toupper_l() will not be so broken as to need
+				 * an islower_l() guard test.  When using the default collation, we
+				 * apply the traditional Postgres behavior that forces ASCII-style
+				 * treatment of I/i, but in non-default collations you get exactly
+				 * what the collation says.
+				 */
+				for (p = result; *p; p++)
+				{
 #ifdef HAVE_LOCALE_T
-			if (mylocale)
-				*p = toupper_l((unsigned char) *p, mylocale);
-			else
+					if (mylocale)
+						*p = toupper_l((unsigned char) *p, mylocale->info.lt);
+					else
 #endif
-				*p = pg_toupper((unsigned char) *p);
+						*p = pg_toupper((unsigned char) *p);
+				}
+			}
 		}
 	}
 
@@ -1720,12 +1757,9 @@ str_initcap(const char *buff, size_t nbytes, Oid collid)
 		result = asc_initcap(buff, nbytes);
 	}
 #ifdef USE_WIDE_UPPER_LOWER
-	else if (pg_database_encoding_max_length() > 1)
+	else
 	{
 		pg_locale_t mylocale = 0;
-		wchar_t    *workspace;
-		size_t		curr_char;
-		size_t		result_size;
 
 		if (collid != DEFAULT_COLLATION_OID)
 		{
@@ -1743,100 +1777,101 @@ str_initcap(const char *buff, size_t nbytes, Oid collid)
 			mylocale = pg_newlocale_from_collation(collid);
 		}
 
-		/* Overflow paranoia */
-		if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
-			ereport(ERROR,
-					(errcode(ERRCODE_OUT_OF_MEMORY),
-					 errmsg("out of memory")));
-
-		/* Output workspace cannot have more codes than input bytes */
-		workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
-
-		char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
-
-		for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
+#ifdef USE_ICU
+		if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
 		{
-#ifdef HAVE_LOCALE_T
-			if (mylocale)
-			{
-				if (wasalnum)
-					workspace[curr_char] = towlower_l(workspace[curr_char], mylocale);
-				else
-					workspace[curr_char] = towupper_l(workspace[curr_char], mylocale);
-				wasalnum = iswalnum_l(workspace[curr_char], mylocale);
-			}
-			else
+			int32_t		len_uchar, len_conv;
+			UChar	   *buff_uchar;
+			UChar	   *buff_conv;
+
+			len_uchar = icu_to_uchar(&buff_uchar, buff, nbytes);
+			len_conv = icu_convert_case(u_strToTitle_default_BI, mylocale, &buff_conv, buff_uchar, len_uchar);
+			icu_from_uchar(&result, buff_conv, len_conv);
+		}
+		else
 #endif
+		{
+			if (pg_database_encoding_max_length() > 1)
 			{
-				if (wasalnum)
-					workspace[curr_char] = towlower(workspace[curr_char]);
-				else
-					workspace[curr_char] = towupper(workspace[curr_char]);
-				wasalnum = iswalnum(workspace[curr_char]);
-			}
-		}
+				wchar_t    *workspace;
+				size_t		curr_char;
+				size_t		result_size;
 
-		/* Make result large enough; case change might change number of bytes */
-		result_size = curr_char * pg_database_encoding_max_length() + 1;
-		result = palloc(result_size);
+				/* Overflow paranoia */
+				if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
+					ereport(ERROR,
+							(errcode(ERRCODE_OUT_OF_MEMORY),
+							 errmsg("out of memory")));
 
-		wchar2char(result, workspace, result_size, mylocale);
-		pfree(workspace);
-	}
-#endif   /* USE_WIDE_UPPER_LOWER */
-	else
-	{
-#ifdef HAVE_LOCALE_T
-		pg_locale_t mylocale = 0;
-#endif
-		char	   *p;
+				/* Output workspace cannot have more codes than input bytes */
+				workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
 
-		if (collid != DEFAULT_COLLATION_OID)
-		{
-			if (!OidIsValid(collid))
-			{
-				/*
-				 * This typically means that the parser could not resolve a
-				 * conflict of implicit collations, so report it that way.
-				 */
-				ereport(ERROR,
-						(errcode(ERRCODE_INDETERMINATE_COLLATION),
-						 errmsg("could not determine which collation to use for initcap() function"),
-						 errhint("Use the COLLATE clause to set the collation explicitly.")));
-			}
+				char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
+
+				for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
+				{
 #ifdef HAVE_LOCALE_T
-			mylocale = pg_newlocale_from_collation(collid);
+					if (mylocale)
+					{
+						if (wasalnum)
+							workspace[curr_char] = towlower_l(workspace[curr_char], mylocale->info.lt);
+						else
+							workspace[curr_char] = towupper_l(workspace[curr_char], mylocale->info.lt);
+						wasalnum = iswalnum_l(workspace[curr_char], mylocale->info.lt);
+					}
+					else
 #endif
-		}
+					{
+						if (wasalnum)
+							workspace[curr_char] = towlower(workspace[curr_char]);
+						else
+							workspace[curr_char] = towupper(workspace[curr_char]);
+						wasalnum = iswalnum(workspace[curr_char]);
+					}
+				}
 
-		result = pnstrdup(buff, nbytes);
+				/* Make result large enough; case change might change number of bytes */
+				result_size = curr_char * pg_database_encoding_max_length() + 1;
+				result = palloc(result_size);
 
-		/*
-		 * Note: we assume that toupper_l()/tolower_l() will not be so broken
-		 * as to need guard tests.  When using the default collation, we apply
-		 * the traditional Postgres behavior that forces ASCII-style treatment
-		 * of I/i, but in non-default collations you get exactly what the
-		 * collation says.
-		 */
-		for (p = result; *p; p++)
-		{
-#ifdef HAVE_LOCALE_T
-			if (mylocale)
-			{
-				if (wasalnum)
-					*p = tolower_l((unsigned char) *p, mylocale);
-				else
-					*p = toupper_l((unsigned char) *p, mylocale);
-				wasalnum = isalnum_l((unsigned char) *p, mylocale);
+				wchar2char(result, workspace, result_size, mylocale);
+				pfree(workspace);
 			}
+#endif   /* USE_WIDE_UPPER_LOWER */
 			else
-#endif
 			{
-				if (wasalnum)
-					*p = pg_tolower((unsigned char) *p);
-				else
-					*p = pg_toupper((unsigned char) *p);
-				wasalnum = isalnum((unsigned char) *p);
+				char	   *p;
+
+				result = pnstrdup(buff, nbytes);
+
+				/*
+				 * Note: we assume that toupper_l()/tolower_l() will not be so broken
+				 * as to need guard tests.  When using the default collation, we apply
+				 * the traditional Postgres behavior that forces ASCII-style treatment
+				 * of I/i, but in non-default collations you get exactly what the
+				 * collation says.
+				 */
+				for (p = result; *p; p++)
+				{
+#ifdef HAVE_LOCALE_T
+					if (mylocale)
+					{
+						if (wasalnum)
+							*p = tolower_l((unsigned char) *p, mylocale->info.lt);
+						else
+							*p = toupper_l((unsigned char) *p, mylocale->info.lt);
+						wasalnum = isalnum_l((unsigned char) *p, mylocale->info.lt);
+					}
+					else
+#endif
+					{
+						if (wasalnum)
+							*p = pg_tolower((unsigned char) *p);
+						else
+							*p = pg_toupper((unsigned char) *p);
+						wasalnum = isalnum((unsigned char) *p);
+					}
+				}
 			}
 		}
 	}
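
For anyone who wants to poke at this case-mapping behavior outside the backend,
here is a minimal standalone sketch of the preflight-and-retry pattern that
icu_convert_case() follows (plain ICU, nothing PostgreSQL-specific; the "tr"
locale and the build line are just illustrations):

/* icu_case_demo.c: sketch only; build e.g. with: cc icu_case_demo.c -licuuc */
#include <stdio.h>
#include <stdlib.h>
#include <unicode/ustring.h>

int
main(void)
{
	UErrorCode	status = U_ZERO_ERROR;
	UChar		src[32];
	UChar	   *dst;
	int32_t		srclen;
	int32_t		dstcap;
	int32_t		dstlen;
	char		out[64];

	u_uastrcpy(src, "istanbul");
	srclen = u_strlen(src);

	/* first attempt: guess that the result is no longer than the input */
	dstcap = srclen + 1;
	dst = malloc(dstcap * sizeof(UChar));
	dstlen = u_strToUpper(dst, dstcap, src, srclen, "tr", &status);
	if (status == U_BUFFER_OVERFLOW_ERROR)
	{
		/* ICU reported the required length; retry with that */
		free(dst);
		dstcap = dstlen + 1;
		dst = malloc(dstcap * sizeof(UChar));
		status = U_ZERO_ERROR;
		dstlen = u_strToUpper(dst, dstcap, src, srclen, "tr", &status);
	}
	if (U_FAILURE(status))
	{
		fprintf(stderr, "u_strToUpper: %s\n", u_errorName(status));
		return 1;
	}

	u_austrcpy(out, dst);		/* back to the default codepage for printing */
	printf("%s (%d code units)\n", out, dstlen);	/* dotted capital I with "tr" */
	free(dst);
	return 0;
}
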
diff --git a/src/backend/utils/adt/like.c b/src/backend/utils/adt/like.c
index 8d9d285fb5..1f683ccd0f 100644
--- a/src/backend/utils/adt/like.c
+++ b/src/backend/utils/adt/like.c
@@ -96,7 +96,7 @@ SB_lower_char(unsigned char c, pg_locale_t locale, bool locale_is_c)
 		return pg_ascii_tolower(c);
 #ifdef HAVE_LOCALE_T
 	else if (locale)
-		return tolower_l(c, locale);
+		return tolower_l(c, locale->info.lt);
 #endif
 	else
 		return pg_tolower(c);
@@ -165,14 +165,36 @@ Generic_Text_IC_like(text *str, text *pat, Oid collation)
 			   *p;
 	int			slen,
 				plen;
+	pg_locale_t locale = 0;
+	bool		locale_is_c = false;
+
+	if (lc_ctype_is_c(collation))
+		locale_is_c = true;
+	else if (collation != DEFAULT_COLLATION_OID)
+	{
+		if (!OidIsValid(collation))
+		{
+			/*
+			 * This typically means that the parser could not resolve a
+			 * conflict of implicit collations, so report it that way.
+			 */
+			ereport(ERROR,
+					(errcode(ERRCODE_INDETERMINATE_COLLATION),
+					 errmsg("could not determine which collation to use for ILIKE"),
+					 errhint("Use the COLLATE clause to set the collation explicitly.")));
+		}
+		locale = pg_newlocale_from_collation(collation);
+	}
 
 	/*
 	 * For efficiency reasons, in the single byte case we don't call lower()
 	 * on the pattern and text, but instead call SB_lower_char on each
-	 * character.  In the multi-byte case we don't have much choice :-(
+	 * character.  In the multi-byte case we don't have much choice :-(.
+	 * Also, ICU does not support single-character case folding, so we go the
+	 * long way.
 	 */
 
-	if (pg_database_encoding_max_length() > 1)
+	if (pg_database_encoding_max_length() > 1 || (locale && locale->provider == COLLPROVIDER_ICU))
 	{
 		/* lower's result is never packed, so OK to use old macros here */
 		pat = DatumGetTextPP(DirectFunctionCall1Coll(lower, collation,
@@ -190,31 +212,6 @@ Generic_Text_IC_like(text *str, text *pat, Oid collation)
 	}
 	else
 	{
-		/*
-		 * Here we need to prepare locale information for SB_lower_char. This
-		 * should match the methods used in str_tolower().
-		 */
-		pg_locale_t locale = 0;
-		bool		locale_is_c = false;
-
-		if (lc_ctype_is_c(collation))
-			locale_is_c = true;
-		else if (collation != DEFAULT_COLLATION_OID)
-		{
-			if (!OidIsValid(collation))
-			{
-				/*
-				 * This typically means that the parser could not resolve a
-				 * conflict of implicit collations, so report it that way.
-				 */
-				ereport(ERROR,
-						(errcode(ERRCODE_INDETERMINATE_COLLATION),
-						 errmsg("could not determine which collation to use for ILIKE"),
-						 errhint("Use the COLLATE clause to set the collation explicitly.")));
-			}
-			locale = pg_newlocale_from_collation(collation);
-		}
-
 		p = VARDATA_ANY(pat);
 		plen = VARSIZE_ANY_EXHDR(pat);
 		s = VARDATA_ANY(str);
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index ab197025f8..4a10655c73 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -57,11 +57,17 @@
 #include "catalog/pg_collation.h"
 #include "catalog/pg_control.h"
 #include "mb/pg_wchar.h"
+#include "utils/builtins.h"
 #include "utils/hsearch.h"
+#include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_locale.h"
 #include "utils/syscache.h"
 
+#ifdef USE_ICU
+#include <unicode/ucnv.h>
+#endif
+
 #ifdef WIN32
 /*
  * This Windows file defines StrNCpy. We don't need it here, so we undefine
@@ -1272,12 +1278,13 @@ pg_newlocale_from_collation(Oid collid)
 	if (cache_entry->locale == 0)
 	{
 		/* We haven't computed this yet in this session, so do it */
-#ifdef HAVE_LOCALE_T
 		HeapTuple	tp;
 		Form_pg_collation collform;
 		const char *collcollate;
-		const char *collctype;
-		locale_t	result;
+		const char *collctype pg_attribute_unused();
+		pg_locale_t	result;
+		Datum		collversion;
+		bool		isnull;
 
 		tp = SearchSysCache1(COLLOID, ObjectIdGetDatum(collid));
 		if (!HeapTupleIsValid(tp))
@@ -1287,61 +1294,223 @@ pg_newlocale_from_collation(Oid collid)
 		collcollate = NameStr(collform->collcollate);
 		collctype = NameStr(collform->collctype);
 
-		if (strcmp(collcollate, collctype) == 0)
+		result = malloc(sizeof(*result));
+		memset(result, 0, sizeof(*result));
+		result->provider = collform->collprovider;
+
+		if (collform->collprovider == COLLPROVIDER_LIBC)
 		{
-			/* Normal case where they're the same */
+#ifdef HAVE_LOCALE_T
+			locale_t	loc;
+
+			if (strcmp(collcollate, collctype) == 0)
+			{
+				/* Normal case where they're the same */
 #ifndef WIN32
-			result = newlocale(LC_COLLATE_MASK | LC_CTYPE_MASK, collcollate,
-							   NULL);
+				loc = newlocale(LC_COLLATE_MASK | LC_CTYPE_MASK, collcollate,
+								   NULL);
 #else
-			result = _create_locale(LC_ALL, collcollate);
+				loc = _create_locale(LC_ALL, collcollate);
 #endif
-			if (!result)
-				report_newlocale_failure(collcollate);
-		}
-		else
-		{
+				if (!loc)
+					report_newlocale_failure(collcollate);
+			}
+			else
+			{
 #ifndef WIN32
-			/* We need two newlocale() steps */
-			locale_t	loc1;
-
-			loc1 = newlocale(LC_COLLATE_MASK, collcollate, NULL);
-			if (!loc1)
-				report_newlocale_failure(collcollate);
-			result = newlocale(LC_CTYPE_MASK, collctype, loc1);
-			if (!result)
-				report_newlocale_failure(collctype);
+				/* We need two newlocale() steps */
+				locale_t	loc1;
+
+				loc1 = newlocale(LC_COLLATE_MASK, collcollate, NULL);
+				if (!loc1)
+					report_newlocale_failure(collcollate);
+				loc = newlocale(LC_CTYPE_MASK, collctype, loc1);
+				if (!loc)
+					report_newlocale_failure(collctype);
 #else
 
-			/*
-			 * XXX The _create_locale() API doesn't appear to support this.
-			 * Could perhaps be worked around by changing pg_locale_t to
-			 * contain two separate fields.
-			 */
+				/*
+				 * XXX The _create_locale() API doesn't appear to support this.
+				 * Could perhaps be worked around by changing pg_locale_t to
+				 * contain two separate fields.
+				 */
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("collations with different collate and ctype values are not supported on this platform")));
+#endif
+			}
+
+			result->info.lt = loc;
+#else							/* not HAVE_LOCALE_T */
+			/* platform that doesn't support locale_t */
 			ereport(ERROR,
 					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-					 errmsg("collations with different collate and ctype values are not supported on this platform")));
-#endif
+					 errmsg("collation provider LIBC is not supported on this platform")));
+#endif   /* not HAVE_LOCALE_T */
+		}
+		else if (collform->collprovider == COLLPROVIDER_ICU)
+		{
+#ifdef USE_ICU
+			UCollator  *collator;
+			UErrorCode	status;
+
+			status = U_ZERO_ERROR;
+			collator = ucol_open(collcollate, &status);
+			if (U_FAILURE(status))
+				ereport(ERROR,
+						(errmsg("could not open collator for locale \"%s\": %s",
+								collcollate, u_errorName(status))));
+
+			result->info.icu.locale = strdup(collcollate);
+			result->info.icu.ucol = collator;
+#else /* not USE_ICU */
+			/* could get here if a collation was created by a build with ICU */
+			ereport(ERROR,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+					 errmsg("ICU is not supported in this build"),
+					 errhint("You need to rebuild PostgreSQL using --with-icu.")));
+#endif /* not USE_ICU */
 		}
 
-		cache_entry->locale = result;
+		collversion = SysCacheGetAttr(COLLOID, tp, Anum_pg_collation_collversion,
+									  &isnull);
+		if (!isnull)
+		{
+			char	   *sysversionstr;
+			char	   *collversionstr;
+
+			sysversionstr = get_system_collation_version(collform->collprovider, collcollate);
+			collversionstr = TextDatumGetCString(collversion);
+
+			if (strcmp(sysversionstr, collversionstr) != 0)
+				ereport(WARNING,
+						(errmsg("collation \"%s\" has version mismatch",
+								NameStr(collform->collname)),
+						 errdetail("The collation in the database was created using version %s, "
+								   "but the operating system provides version %s.",
+								   collversionstr, sysversionstr),
+						 errhint("Rebuild all objects affected by this collation and run "
+								 "ALTER COLLATION %s REFRESH VERSION, "
+								 "or build PostgreSQL with the right library version.",
+								 quote_qualified_identifier(get_namespace_name(collform->collnamespace),
+															NameStr(collform->collname)))));
+		}
 
 		ReleaseSysCache(tp);
-#else							/* not HAVE_LOCALE_T */
 
-		/*
-		 * For platforms that don't support locale_t, we can't do anything
-		 * with non-default collations.
-		 */
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-		errmsg("nondefault collations are not supported on this platform")));
-#endif   /* not HAVE_LOCALE_T */
+		cache_entry->locale = result;
 	}
 
 	return cache_entry->locale;
 }
 
+/*
+ * Get provider-specific collation version string for the given collation from
+ * the operating system/library.
+ *
+ * A particular provider must always either return a non-NULL string or return
+ * NULL (if it doesn't support versions).  It must not return NULL for some
+ * collcollate and not NULL for others.
+ */
+char *
+get_system_collation_version(char collprovider, const char *collcollate)
+{
+	char	   *collversion;
+
+#ifdef USE_ICU
+	if (collprovider == COLLPROVIDER_ICU)
+	{
+		UCollator  *collator;
+		UErrorCode	status;
+		UVersionInfo versioninfo;
+		char		buf[U_MAX_VERSION_STRING_LENGTH];
+
+		status = U_ZERO_ERROR;
+		collator = ucol_open(collcollate, &status);
+		if (U_FAILURE(status))
+			ereport(ERROR,
+					(errmsg("could not open collator for locale \"%s\": %s",
+							collcollate, u_errorName(status))));
+		ucol_getVersion(collator, versioninfo);
+		ucol_close(collator);
+
+		u_versionToString(versioninfo, buf);
+		collversion = pstrdup(buf);
+	}
+	else
+#endif
+		collversion = NULL;
+
+	return collversion;
+}
+
+
+#ifdef USE_ICU
+/*
+ * Converter object for converting between ICU's UChar strings and C strings
+ * in database encoding.  Since the database encoding doesn't change, we only
+ * need one of these per session.
+ */
+static UConverter *icu_converter = NULL;
+
+static void
+init_icu_converter(void)
+{
+	const char *icu_encoding_name;
+	UErrorCode	status;
+	UConverter *conv;
+
+	if (icu_converter)
+		return;
+
+	icu_encoding_name = get_encoding_name_for_icu(GetDatabaseEncoding());
+
+	status = U_ZERO_ERROR;
+	conv = ucnv_open(icu_encoding_name, &status);
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("could not open ICU converter for encoding \"%s\": %s",
+						icu_encoding_name, u_errorName(status))));
+
+	icu_converter = conv;
+}
+
+int32_t
+icu_to_uchar(UChar **buff_uchar, const char *buff, size_t nbytes)
+{
+	UErrorCode	status;
+	int32_t		len_uchar;
+
+	init_icu_converter();
+
+	len_uchar = 2 * nbytes;  /* max length per docs */
+	*buff_uchar = palloc(len_uchar * sizeof(**buff_uchar));
+	status = U_ZERO_ERROR;
+	len_uchar = ucnv_toUChars(icu_converter, *buff_uchar, len_uchar, buff, nbytes, &status);
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("ucnv_toUChars failed: %s", u_errorName(status))));
+	return len_uchar;
+}
+
+int32_t
+icu_from_uchar(char **result, UChar *buff_uchar, int32_t len_uchar)
+{
+	UErrorCode	status;
+	int32_t		len_result;
+
+	init_icu_converter();
+
+	len_result = UCNV_GET_MAX_BYTES_FOR_STRING(len_uchar, ucnv_getMaxCharSize(icu_converter));
+	*result = palloc(len_result + 1);
+	status = U_ZERO_ERROR;
+	len_result = ucnv_fromUChars(icu_converter, *result, len_result + 1, buff_uchar, len_uchar, &status);
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("ucnv_fromUChars failed: %s", u_errorName(status))));
+	return len_result;
+}
+#endif
 
 /*
  * These functions convert from/to libc's wchar_t, *not* pg_wchar_t.
@@ -1362,6 +1531,8 @@ wchar2char(char *to, const wchar_t *from, size_t tolen, pg_locale_t locale)
 {
 	size_t		result;
 
+	Assert(!locale || locale->provider == COLLPROVIDER_LIBC);
+
 	if (tolen == 0)
 		return 0;
 
@@ -1398,10 +1569,10 @@ wchar2char(char *to, const wchar_t *from, size_t tolen, pg_locale_t locale)
 #ifdef HAVE_LOCALE_T
 #ifdef HAVE_WCSTOMBS_L
 		/* Use wcstombs_l for nondefault locales */
-		result = wcstombs_l(to, from, tolen, locale);
+		result = wcstombs_l(to, from, tolen, locale->info.lt);
 #else							/* !HAVE_WCSTOMBS_L */
 		/* We have to temporarily set the locale as current ... ugh */
-		locale_t	save_locale = uselocale(locale);
+		locale_t	save_locale = uselocale(locale->info.lt);
 
 		result = wcstombs(to, from, tolen);
 
@@ -1432,6 +1603,8 @@ char2wchar(wchar_t *to, size_t tolen, const char *from, size_t fromlen,
 {
 	size_t		result;
 
+	Assert(!locale || locale->provider == COLLPROVIDER_LIBC);
+
 	if (tolen == 0)
 		return 0;
 
@@ -1473,10 +1646,10 @@ char2wchar(wchar_t *to, size_t tolen, const char *from, size_t fromlen,
 #ifdef HAVE_LOCALE_T
 #ifdef HAVE_MBSTOWCS_L
 			/* Use mbstowcs_l for nondefault locales */
-			result = mbstowcs_l(to, str, tolen, locale);
+			result = mbstowcs_l(to, str, tolen, locale->info.lt);
 #else							/* !HAVE_MBSTOWCS_L */
 			/* We have to temporarily set the locale as current ... ugh */
-			locale_t	save_locale = uselocale(locale);
+			locale_t	save_locale = uselocale(locale->info.lt);
 
 			result = mbstowcs(to, str, tolen);
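
The value recorded in collversion comes straight from ucol_getVersion().  A
self-contained sketch of that call sequence, in case you want to see what your
ICU build reports for a given locale ("en" here is arbitrary):

/* icu_version_demo.c: sketch only; link with ICU's common and i18n libraries */
#include <stdio.h>
#include <unicode/ucol.h>

int
main(void)
{
	UErrorCode	status = U_ZERO_ERROR;
	UCollator  *coll;
	UVersionInfo vinfo;
	char		buf[U_MAX_VERSION_STRING_LENGTH];

	coll = ucol_open("en", &status);
	if (U_FAILURE(status))
	{
		fprintf(stderr, "ucol_open: %s\n", u_errorName(status));
		return 1;
	}

	/* this is the string get_system_collation_version() would hand back */
	ucol_getVersion(coll, vinfo);
	u_versionToString(vinfo, buf);
	printf("collator version for \"en\": %s\n", buf);

	ucol_close(coll);
	return 0;
}
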
 
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 04bd9b95b2..c83bb39a12 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -5258,7 +5258,7 @@ find_join_input_rel(PlannerInfo *root, Relids relids)
 /*
  * Check whether char is a letter (and, hence, subject to case-folding)
  *
- * In multibyte character sets, we can't use isalpha, and it does not seem
+ * In multibyte character sets or with ICU, we can't use isalpha, and it does not seem
  * worth trying to convert to wchar_t to use iswalpha.  Instead, just assume
  * any multibyte char is potentially case-varying.
  */
@@ -5270,9 +5270,11 @@ pattern_char_isalpha(char c, bool is_multibyte,
 		return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
 	else if (is_multibyte && IS_HIGHBIT_SET(c))
 		return true;
+	else if (locale && locale->provider == COLLPROVIDER_ICU)
+		return IS_HIGHBIT_SET(c) || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
 #ifdef HAVE_LOCALE_T
-	else if (locale)
-		return isalpha_l((unsigned char) c, locale);
+	else if (locale && locale->provider == COLLPROVIDER_LIBC)
+		return isalpha_l((unsigned char) c, locale->info.lt);
 #endif
 	else
 		return isalpha((unsigned char) c);
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index cd036afc00..aa556aa5de 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -73,9 +73,7 @@ typedef struct
 	hyperLogLogState abbr_card; /* Abbreviated key cardinality state */
 	hyperLogLogState full_card; /* Full key cardinality state */
 	double		prop_card;		/* Required cardinality proportion */
-#ifdef HAVE_LOCALE_T
 	pg_locale_t locale;
-#endif
 } VarStringSortSupport;
 
 /*
@@ -1403,10 +1401,7 @@ varstr_cmp(char *arg1, int len1, char *arg2, int len2, Oid collid)
 		char		a2buf[TEXTBUFLEN];
 		char	   *a1p,
 				   *a2p;
-
-#ifdef HAVE_LOCALE_T
 		pg_locale_t mylocale = 0;
-#endif
 
 		if (collid != DEFAULT_COLLATION_OID)
 		{
@@ -1421,9 +1416,7 @@ varstr_cmp(char *arg1, int len1, char *arg2, int len2, Oid collid)
 						 errmsg("could not determine which collation to use for string comparison"),
 						 errhint("Use the COLLATE clause to set the collation explicitly.")));
 			}
-#ifdef HAVE_LOCALE_T
 			mylocale = pg_newlocale_from_collation(collid);
-#endif
 		}
 
 		/*
@@ -1542,11 +1535,54 @@ varstr_cmp(char *arg1, int len1, char *arg2, int len2, Oid collid)
 		memcpy(a2p, arg2, len2);
 		a2p[len2] = '\0';
 
-#ifdef HAVE_LOCALE_T
 		if (mylocale)
-			result = strcoll_l(a1p, a2p, mylocale);
-		else
+		{
+			if (mylocale->provider == COLLPROVIDER_ICU)
+			{
+#ifdef USE_ICU
+#ifdef HAVE_UCOL_STRCOLLUTF8
+				if (GetDatabaseEncoding() == PG_UTF8)
+				{
+					UErrorCode	status;
+
+					status = U_ZERO_ERROR;
+					result = ucol_strcollUTF8(mylocale->info.icu.ucol,
+											  arg1, len1,
+											  arg2, len2,
+											  &status);
+					if (U_FAILURE(status))
+						ereport(ERROR,
+								(errmsg("collation failed: %s", u_errorName(status))));
+				}
+				else
+#endif
+				{
+					int32_t ulen1, ulen2;
+					UChar *uchar1, *uchar2;
+
+					ulen1 = icu_to_uchar(&uchar1, arg1, len1);
+					ulen2 = icu_to_uchar(&uchar2, arg2, len2);
+
+					result = ucol_strcoll(mylocale->info.icu.ucol,
+										  uchar1, ulen1,
+										  uchar2, ulen2);
+				}
+#else	/* not USE_ICU */
+				/* shouldn't happen */
+				elog(ERROR, "unsupported collprovider: %c", mylocale->provider);
+#endif	/* not USE_ICU */
+			}
+			else
+			{
+#ifdef HAVE_LOCALE_T
+				result = strcoll_l(a1p, a2p, mylocale->info.lt);
+#else
+				/* shouldn't happen */
+				elog(ERROR, "unsupported collprovider: %c", mylocale->provider);
 #endif
+			}
+		}
+		else
 			result = strcoll(a1p, a2p);
 
 		/*
@@ -1768,10 +1804,7 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
 	bool		abbreviate = ssup->abbreviate;
 	bool		collate_c = false;
 	VarStringSortSupport *sss;
-
-#ifdef HAVE_LOCALE_T
 	pg_locale_t locale = 0;
-#endif
 
 	/*
 	 * If possible, set ssup->comparator to a function which can be used to
@@ -1826,9 +1859,7 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
 						 errmsg("could not determine which collation to use for string comparison"),
 						 errhint("Use the COLLATE clause to set the collation explicitly.")));
 			}
-#ifdef HAVE_LOCALE_T
 			locale = pg_newlocale_from_collation(collid);
-#endif
 		}
 	}
 
@@ -1854,7 +1885,7 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
 	 * platforms.
 	 */
 #ifndef TRUST_STRXFRM
-	if (!collate_c)
+	if (!collate_c && !(locale && locale->provider == COLLPROVIDER_ICU))
 		abbreviate = false;
 #endif
 
@@ -1877,9 +1908,7 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
 		sss->last_len2 = -1;
 		/* Initialize */
 		sss->last_returned = 0;
-#ifdef HAVE_LOCALE_T
 		sss->locale = locale;
-#endif
 
 		/*
 		 * To avoid somehow confusing a strxfrm() blob and an original string,
@@ -2090,11 +2119,54 @@ varstrfastcmp_locale(Datum x, Datum y, SortSupport ssup)
 		goto done;
 	}
 
-#ifdef HAVE_LOCALE_T
 	if (sss->locale)
-		result = strcoll_l(sss->buf1, sss->buf2, sss->locale);
-	else
+	{
+		if (sss->locale->provider == COLLPROVIDER_ICU)
+		{
+#ifdef USE_ICU
+#ifdef HAVE_UCOL_STRCOLLUTF8
+			if (GetDatabaseEncoding() == PG_UTF8)
+			{
+				UErrorCode	status;
+
+				status = U_ZERO_ERROR;
+				result = ucol_strcollUTF8(sss->locale->info.icu.ucol,
+										  a1p, len1,
+										  a2p, len2,
+										  &status);
+				if (U_FAILURE(status))
+					ereport(ERROR,
+							(errmsg("collation failed: %s", u_errorName(status))));
+			}
+			else
 #endif
+			{
+				int32_t ulen1, ulen2;
+				UChar *uchar1, *uchar2;
+
+				ulen1 = icu_to_uchar(&uchar1, a1p, len1);
+				ulen2 = icu_to_uchar(&uchar2, a2p, len2);
+
+				result = ucol_strcoll(sss->locale->info.icu.ucol,
+									  uchar1, ulen1,
+									  uchar2, ulen2);
+			}
+#else	/* not USE_ICU */
+			/* shouldn't happen */
+			elog(ERROR, "unsupported collprovider: %c", sss->locale->provider);
+#endif	/* not USE_ICU */
+		}
+		else
+		{
+#ifdef HAVE_LOCALE_T
+			result = strcoll_l(sss->buf1, sss->buf2, sss->locale->info.lt);
+#else
+			/* shouldn't happen */
+			elog(ERROR, "unsupported collprovider: %c", sss->locale->provider);
+#endif
+		}
+	}
+	else
 		result = strcoll(sss->buf1, sss->buf2);
 
 	/*
@@ -2200,9 +2272,14 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
 	else
 	{
 		Size		bsize;
+#ifdef USE_ICU
+		int32_t		ulen = -1;
+		UChar	   *uchar;
+#endif
 
 		/*
-		 * We're not using the C collation, so fall back on strxfrm.
+		 * We're not using the C collation, so fall back on strxfrm or ICU
+		 * analogs.
 		 */
 
 		/* By convention, we use buffer 1 to store and NUL-terminate */
@@ -2222,17 +2299,66 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
 			goto done;
 		}
 
-		/* Just like strcoll(), strxfrm() expects a NUL-terminated string */
 		memcpy(sss->buf1, authoritative_data, len);
+		/* Just like strcoll(), strxfrm() expects a NUL-terminated string.
+		 * Not necessary for ICU, but doesn't hurt. */
 		sss->buf1[len] = '\0';
 		sss->last_len1 = len;
 
+#ifdef USE_ICU
+		/* When using ICU and not UTF8, convert string to UChar. */
+		if (sss->locale && sss->locale->provider == COLLPROVIDER_ICU &&
+			GetDatabaseEncoding() != PG_UTF8)
+			ulen = icu_to_uchar(&uchar, sss->buf1, len);
+#endif
+
+		/*
+		 * Loop: Call strxfrm() or ucol_getSortKey(), possibly enlarge buffer,
+		 * and try again.  Both of these functions have the result buffer
+		 * content undefined if the result did not fit, so we need to retry
+		 * until everything fits, even though we only need the first few bytes
+		 * in the end.  When using ucol_nextSortKeyPart(), however, we only
+		 * ask for as many bytes as we actually need.
+		 */
 		for (;;)
 		{
+#ifdef USE_ICU
+			if (sss->locale && sss->locale->provider == COLLPROVIDER_ICU)
+			{
+				/*
+				 * When using UTF8, use the iteration interface so we only
+				 * need to produce as many bytes as we actually need.
+				 */
+				if (GetDatabaseEncoding() == PG_UTF8)
+				{
+					UCharIterator iter;
+					uint32_t	state[2];
+					UErrorCode	status;
+
+					uiter_setUTF8(&iter, sss->buf1, len);
+					state[0] = state[1] = 0;  /* we only fetch the first part, so no state to carry */
+					status = U_ZERO_ERROR;
+					bsize = ucol_nextSortKeyPart(sss->locale->info.icu.ucol,
+												 &iter,
+												 state,
+												 (uint8_t *) sss->buf2,
+												 Min(sizeof(Datum), sss->buflen2),
+												 &status);
+					if (U_FAILURE(status))
+						ereport(ERROR,
+								(errmsg("sort key generation failed: %s", u_errorName(status))));
+				}
+				else
+					bsize = ucol_getSortKey(sss->locale->info.icu.ucol,
+											uchar, ulen,
+											(uint8_t *) sss->buf2, sss->buflen2);
+			}
+			else
+#endif
 #ifdef HAVE_LOCALE_T
-			if (sss->locale)
+			if (sss->locale && sss->locale->provider == COLLPROVIDER_LIBC)
 				bsize = strxfrm_l(sss->buf2, sss->buf1,
-								  sss->buflen2, sss->locale);
+								  sss->buflen2, sss->locale->info.lt);
 			else
 #endif
 				bsize = strxfrm(sss->buf2, sss->buf1, sss->buflen2);
@@ -2242,8 +2368,7 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
 				break;
 
 			/*
-			 * The C standard states that the contents of the buffer is now
-			 * unspecified.  Grow buffer, and retry.
+			 * Grow buffer and retry.
 			 */
 			pfree(sss->buf2);
 			sss->buflen2 = Max(bsize + 1,
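
If you want to experiment with the comparison and sort-key calls used above
without a backend, something like the following works against plain ICU; the
"sv" locale and test strings are just placeholders:

/* icu_collate_demo.c: sketch only */
#include <stdio.h>
#include <stdint.h>
#include <unicode/ucol.h>
#include <unicode/ustring.h>

int
main(void)
{
	UErrorCode	status = U_ZERO_ERROR;
	UCollator  *coll;
	UChar		s1[16];
	UChar		s2[16];
	uint8_t		key[64];
	int32_t		keylen;
	UCollationResult cmp;

	coll = ucol_open("sv", &status);
	if (U_FAILURE(status))
		return 1;

	u_uastrcpy(s1, "zebra");
	u_uastrcpy(s2, "angle");

	/* like strcoll(): UCOL_LESS, UCOL_EQUAL or UCOL_GREATER */
	cmp = ucol_strcoll(coll, s1, u_strlen(s1), s2, u_strlen(s2));
	printf("ucol_strcoll: %d\n", (int) cmp);

	/*
	 * Sort keys compare correctly with plain memcmp(); a prefix of this is
	 * what the abbreviated-key path stuffs into a Datum.
	 */
	keylen = ucol_getSortKey(coll, s1, u_strlen(s1), key, (int32_t) sizeof(key));
	printf("sort key: %d bytes\n", keylen);

	ucol_close(coll);
	return 0;
}
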
diff --git a/src/backend/utils/mb/encnames.c b/src/backend/utils/mb/encnames.c
index 11099b844f..444eec25b5 100644
--- a/src/backend/utils/mb/encnames.c
+++ b/src/backend/utils/mb/encnames.c
@@ -403,6 +403,82 @@ const pg_enc2gettext pg_enc2gettext_tbl[] =
 };
 
 
+#ifndef FRONTEND
+
+/*
+ * Table of encoding names for ICU
+ *
+ * Reference: <https://ssl.icu-project.org/icu-bin/convexp>
+ *
+ * NULL entries are not supported by ICU, or their mapping is unclear.
+ */
+static const char * const pg_enc2icu_tbl[] =
+{
+	NULL,					/* PG_SQL_ASCII */
+	"EUC-JP",				/* PG_EUC_JP */
+	"EUC-CN",				/* PG_EUC_CN */
+	"EUC-KR",				/* PG_EUC_KR */
+	"EUC-TW",				/* PG_EUC_TW */
+	NULL,					/* PG_EUC_JIS_2004 */
+	"UTF-8",				/* PG_UTF8 */
+	NULL,					/* PG_MULE_INTERNAL */
+	"ISO-8859-1",			/* PG_LATIN1 */
+	"ISO-8859-2",			/* PG_LATIN2 */
+	"ISO-8859-3",			/* PG_LATIN3 */
+	"ISO-8859-4",			/* PG_LATIN4 */
+	"ISO-8859-9",			/* PG_LATIN5 */
+	"ISO-8859-10",			/* PG_LATIN6 */
+	"ISO-8859-13",			/* PG_LATIN7 */
+	"ISO-8859-14",			/* PG_LATIN8 */
+	"ISO-8859-15",			/* PG_LATIN9 */
+	NULL,					/* PG_LATIN10 */
+	"CP1256",				/* PG_WIN1256 */
+	"CP1258",				/* PG_WIN1258 */
+	"CP866",				/* PG_WIN866 */
+	NULL,					/* PG_WIN874 */
+	"KOI8-R",				/* PG_KOI8R */
+	"CP1251",				/* PG_WIN1251 */
+	"CP1252",				/* PG_WIN1252 */
+	"ISO-8859-5",			/* PG_ISO_8859_5 */
+	"ISO-8859-6",			/* PG_ISO_8859_6 */
+	"ISO-8859-7",			/* PG_ISO_8859_7 */
+	"ISO-8859-8",			/* PG_ISO_8859_8 */
+	"CP1250",				/* PG_WIN1250 */
+	"CP1253",				/* PG_WIN1253 */
+	"CP1254",				/* PG_WIN1254 */
+	"CP1255",				/* PG_WIN1255 */
+	"CP1257",				/* PG_WIN1257 */
+	"KOI8-U",				/* PG_KOI8U */
+};
+
+bool
+is_encoding_supported_by_icu(int encoding)
+{
+	return (pg_enc2icu_tbl[encoding] != NULL);
+}
+
+const char *
+get_encoding_name_for_icu(int encoding)
+{
+	const char *icu_encoding_name;
+
+	StaticAssertStmt(lengthof(pg_enc2icu_tbl) == PG_ENCODING_BE_LAST + 1,
+					 "pg_enc2icu_tbl incomplete");
+
+	icu_encoding_name = pg_enc2icu_tbl[encoding];
+
+	if (!icu_encoding_name)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("encoding \"%s\" not supported by ICU",
+						pg_encoding_to_char(encoding))));
+
+	return icu_encoding_name;
+}
+
+#endif /* not FRONTEND */
+
+
 /* ----------
  * Encoding checks, for error returns -1 else encoding id
  * ----------
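
The converter names in pg_enc2icu_tbl above are the ones ucnv_open() accepts.
A quick standalone round-trip, roughly what init_icu_converter() plus
icu_to_uchar()/icu_from_uchar() end up doing (sketch only; "ISO-8859-1" and
the buffer sizes are arbitrary):

/* icu_conv_demo.c: sketch only */
#include <stdio.h>
#include <string.h>
#include <unicode/ucnv.h>

int
main(void)
{
	UErrorCode	status = U_ZERO_ERROR;
	UConverter *conv;
	const char *in = "hello";
	UChar		ubuf[32];
	char		out[32];
	int32_t		ulen;
	int32_t		olen;

	conv = ucnv_open("ISO-8859-1", &status);
	if (U_FAILURE(status))
	{
		fprintf(stderr, "ucnv_open: %s\n", u_errorName(status));
		return 1;
	}

	ulen = ucnv_toUChars(conv, ubuf, 32, in, (int32_t) strlen(in), &status);
	olen = ucnv_fromUChars(conv, out, 32, ubuf, ulen, &status);
	if (U_FAILURE(status))
		return 1;

	printf("%d bytes -> %d UChars -> %d bytes\n", (int) strlen(in), ulen, olen);

	ucnv_close(conv);
	return 0;
}
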
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 9b71e0042d..784fe90376 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -62,6 +62,7 @@
 #include "catalog/catalog.h"
 #include "catalog/pg_authid.h"
 #include "catalog/pg_class.h"
+#include "catalog/pg_collation.h"
 #include "common/file_utils.h"
 #include "common/restricted_token.h"
 #include "common/username.h"
@@ -1629,7 +1630,7 @@ setup_collation(FILE *cmdfd)
 	PG_CMD_PUTS("SELECT pg_import_system_collations(if_not_exists => false, schema => 'pg_catalog');\n\n");
 
 	/* Add an SQL-standard name */
-	PG_CMD_PRINTF2("INSERT INTO pg_collation (collname, collnamespace, collowner, collencoding, collcollate, collctype) VALUES ('ucs_basic', 'pg_catalog'::regnamespace, %u, %d, 'C', 'C');\n\n", BOOTSTRAP_SUPERUSERID, PG_UTF8);
+	PG_CMD_PRINTF3("INSERT INTO pg_collation (collname, collnamespace, collowner, collprovider, collencoding, collcollate, collctype) VALUES ('ucs_basic', 'pg_catalog'::regnamespace, %u, '%c', %d, 'C', 'C');\n\n", BOOTSTRAP_SUPERUSERID, COLLPROVIDER_LIBC, PG_UTF8);
 }
 
 /*
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index e67171dccb..32a99ce2a7 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -12806,8 +12806,10 @@ dumpCollation(Archive *fout, CollInfo *collinfo)
 	PQExpBuffer delq;
 	PQExpBuffer labelq;
 	PGresult   *res;
+	int			i_collprovider;
 	int			i_collcollate;
 	int			i_collctype;
+	const char *collprovider;
 	const char *collcollate;
 	const char *collctype;
 
@@ -12824,18 +12826,30 @@ dumpCollation(Archive *fout, CollInfo *collinfo)
 	selectSourceSchema(fout, collinfo->dobj.namespace->dobj.name);
 
 	/* Get collation-specific details */
-	appendPQExpBuffer(query, "SELECT "
-					  "collcollate, "
-					  "collctype "
-					  "FROM pg_catalog.pg_collation c "
-					  "WHERE c.oid = '%u'::pg_catalog.oid",
-					  collinfo->dobj.catId.oid);
+	if (fout->remoteVersion >= 100000)
+		appendPQExpBuffer(query, "SELECT "
+						  "collprovider, "
+						  "collcollate, "
+						  "collctype "
+						  "FROM pg_catalog.pg_collation c "
+						  "WHERE c.oid = '%u'::pg_catalog.oid",
+						  collinfo->dobj.catId.oid);
+	else
+		appendPQExpBuffer(query, "SELECT "
+						  "'c'::char AS collprovider, "
+						  "collcollate, "
+						  "collctype "
+						  "FROM pg_catalog.pg_collation c "
+						  "WHERE c.oid = '%u'::pg_catalog.oid",
+						  collinfo->dobj.catId.oid);
 
 	res = ExecuteSqlQueryForSingleRow(fout, query->data);
 
+	i_collprovider = PQfnumber(res, "collprovider");
 	i_collcollate = PQfnumber(res, "collcollate");
 	i_collctype = PQfnumber(res, "collctype");
 
+	collprovider = PQgetvalue(res, 0, i_collprovider);
 	collcollate = PQgetvalue(res, 0, i_collcollate);
 	collctype = PQgetvalue(res, 0, i_collctype);
 
@@ -12847,11 +12861,32 @@ dumpCollation(Archive *fout, CollInfo *collinfo)
 	appendPQExpBuffer(delq, ".%s;\n",
 					  fmtId(collinfo->dobj.name));
 
-	appendPQExpBuffer(q, "CREATE COLLATION %s (lc_collate = ",
+	appendPQExpBuffer(q, "CREATE COLLATION %s (",
 					  fmtId(collinfo->dobj.name));
-	appendStringLiteralAH(q, collcollate, fout);
-	appendPQExpBufferStr(q, ", lc_ctype = ");
-	appendStringLiteralAH(q, collctype, fout);
+
+	appendPQExpBufferStr(q, "provider = ");
+	if (collprovider[0] == 'c')
+		appendStringLiteralAH(q, "libc", fout);
+	else if (collprovider[0] == 'i')
+		appendStringLiteralAH(q, "icu", fout);
+	else
+		exit_horribly(NULL,
+					  "unrecognized collation provider: %s\n",
+					  collprovider);
+
+	if (strcmp(collcollate, collctype) == 0)
+	{
+		appendPQExpBufferStr(q, ", locale = ");
+		appendStringLiteralAH(q, collcollate, fout);
+	}
+	else
+	{
+		appendPQExpBufferStr(q, ", lc_collate = ");
+		appendStringLiteralAH(q, collcollate, fout);
+		appendPQExpBufferStr(q, ", lc_ctype = ");
+		appendStringLiteralAH(q, collctype, fout);
+	}
+
 	appendPQExpBufferStr(q, ");\n");
 
 	appendPQExpBuffer(labelq, "COLLATION %s", fmtId(collinfo->dobj.name));
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 61a3e2a848..8c583127fd 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3738,7 +3738,7 @@ listCollations(const char *pattern, bool verbose, bool showSystem)
 	PQExpBufferData buf;
 	PGresult   *res;
 	printQueryOpt myopt = pset.popt;
-	static const bool translate_columns[] = {false, false, false, false, false};
+	static const bool translate_columns[] = {false, false, false, false, false, false};
 
 	if (pset.sversion < 90100)
 	{
@@ -3762,6 +3762,11 @@ listCollations(const char *pattern, bool verbose, bool showSystem)
 					  gettext_noop("Collate"),
 					  gettext_noop("Ctype"));
 
+	if (pset.sversion >= 100000)
+		appendPQExpBuffer(&buf,
+						  ",\n       CASE c.collprovider WHEN 'd' THEN 'default' WHEN 'c' THEN 'libc' WHEN 'i' THEN 'icu' END AS \"%s\"",
+						  gettext_noop("Provider"));
+
 	if (verbose)
 		appendPQExpBuffer(&buf,
 						  ",\n       pg_catalog.obj_description(c.oid, 'pg_collation') AS \"%s\"",
diff --git a/src/include/catalog/pg_collation.h b/src/include/catalog/pg_collation.h
index 30c87e004e..8edd8aa066 100644
--- a/src/include/catalog/pg_collation.h
+++ b/src/include/catalog/pg_collation.h
@@ -34,9 +34,13 @@ CATALOG(pg_collation,3456)
 	NameData	collname;		/* collation name */
 	Oid			collnamespace;	/* OID of namespace containing collation */
 	Oid			collowner;		/* owner of collation */
+	char		collprovider;	/* see constants below */
 	int32		collencoding;	/* encoding for this collation; -1 = "all" */
 	NameData	collcollate;	/* LC_COLLATE setting */
 	NameData	collctype;		/* LC_CTYPE setting */
+#ifdef CATALOG_VARLEN			/* variable-length fields start here */
+	text		collversion;	/* provider-dependent version of collation data */
+#endif
 } FormData_pg_collation;
 
 /* ----------------
@@ -50,27 +54,34 @@ typedef FormData_pg_collation *Form_pg_collation;
  *		compiler constants for pg_collation
  * ----------------
  */
-#define Natts_pg_collation				6
+#define Natts_pg_collation				8
 #define Anum_pg_collation_collname		1
 #define Anum_pg_collation_collnamespace 2
 #define Anum_pg_collation_collowner		3
-#define Anum_pg_collation_collencoding	4
-#define Anum_pg_collation_collcollate	5
-#define Anum_pg_collation_collctype		6
+#define Anum_pg_collation_collprovider	4
+#define Anum_pg_collation_collencoding	5
+#define Anum_pg_collation_collcollate	6
+#define Anum_pg_collation_collctype		7
+#define Anum_pg_collation_collversion	8
 
 /* ----------------
  *		initial contents of pg_collation
  * ----------------
  */
 
-DATA(insert OID = 100 ( default		PGNSP PGUID -1 "" "" ));
+DATA(insert OID = 100 ( default		PGNSP PGUID d -1 "" "" _null_ ));
 DESCR("database's default collation");
 #define DEFAULT_COLLATION_OID	100
-DATA(insert OID = 950 ( C			PGNSP PGUID -1 "C" "C" ));
+DATA(insert OID = 950 ( C			PGNSP PGUID c -1 "C" "C" _null_ ));
 DESCR("standard C collation");
 #define C_COLLATION_OID			950
-DATA(insert OID = 951 ( POSIX		PGNSP PGUID -1 "POSIX" "POSIX" ));
+DATA(insert OID = 951 ( POSIX		PGNSP PGUID c -1 "POSIX" "POSIX" _null_ ));
 DESCR("standard POSIX collation");
 #define POSIX_COLLATION_OID		951
 
+
+#define COLLPROVIDER_DEFAULT	'd'
+#define COLLPROVIDER_ICU		'i'
+#define COLLPROVIDER_LIBC		'c'
+
 #endif   /* PG_COLLATION_H */
diff --git a/src/include/catalog/pg_collation_fn.h b/src/include/catalog/pg_collation_fn.h
index 482ba7920e..2a50e54124 100644
--- a/src/include/catalog/pg_collation_fn.h
+++ b/src/include/catalog/pg_collation_fn.h
@@ -16,6 +16,7 @@
 
 extern Oid CollationCreate(const char *collname, Oid collnamespace,
 				Oid collowner,
+				char collprovider,
 				int32 collencoding,
 				const char *collcollate, const char *collctype,
 				bool if_not_exists);
diff --git a/src/include/commands/collationcmds.h b/src/include/commands/collationcmds.h
index 3b2fcb8271..df5623ccb6 100644
--- a/src/include/commands/collationcmds.h
+++ b/src/include/commands/collationcmds.h
@@ -20,5 +20,6 @@
 
 extern ObjectAddress DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_exists);
 extern void IsThereCollationInNamespace(const char *collname, Oid nspOid);
+extern ObjectAddress AlterCollation(AlterCollationStmt *stmt);
 
 #endif   /* COLLATIONCMDS_H */
diff --git a/src/include/mb/pg_wchar.h b/src/include/mb/pg_wchar.h
index 5f54697302..9c5e749c9e 100644
--- a/src/include/mb/pg_wchar.h
+++ b/src/include/mb/pg_wchar.h
@@ -333,6 +333,12 @@ typedef struct pg_enc2gettext
 extern const pg_enc2gettext pg_enc2gettext_tbl[];
 
 /*
+ * Encoding names for ICU
+ */
+extern bool is_encoding_supported_by_icu(int encoding);
+extern const char *get_encoding_name_for_icu(int encoding);
+
+/*
  * pg_wchar stuff
  */
 typedef int (*mb2wchar_with_len_converter) (const unsigned char *from,
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 2bc7a5df11..5d6d05385c 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -423,6 +423,7 @@ typedef enum NodeTag
 	T_CreateSubscriptionStmt,
 	T_AlterSubscriptionStmt,
 	T_DropSubscriptionStmt,
+	T_AlterCollationStmt,
 
 	/*
 	 * TAGS FOR PARSE TREE NODES (parsenodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index a44d2178e1..4a3bbdd822 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1733,6 +1733,17 @@ typedef struct AlterTableCmd	/* one subcommand of an ALTER TABLE */
 
 
 /* ----------------------
+ * Alter Collation
+ * ----------------------
+ */
+typedef struct AlterCollationStmt
+{
+	NodeTag		type;
+	List	   *collname;
+} AlterCollationStmt;
+
+
+/* ----------------------
  *	Alter Domain
  *
  * The fields are used in different ways by the different variants of
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 5bcd8a1160..0ca708bbdf 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -603,6 +603,9 @@
 /* Define to 1 if you have the external array `tzname'. */
 #undef HAVE_TZNAME
 
+/* Define to 1 if you have the `ucol_strcollUTF8' function. */
+#undef HAVE_UCOL_STRCOLLUTF8
+
 /* Define to 1 if you have the <ucred.h> header file. */
 #undef HAVE_UCRED_H
 
@@ -816,6 +819,9 @@
    (--enable-float8-byval) */
 #undef USE_FLOAT8_BYVAL
 
+/* Define to build with ICU support. (--with-icu) */
+#undef USE_ICU
+
 /* Define to 1 to build with LDAP support. (--with-ldap) */
 #undef USE_LDAP
 
diff --git a/src/include/utils/pg_locale.h b/src/include/utils/pg_locale.h
index cb509e2b6b..e5f18f15b2 100644
--- a/src/include/utils/pg_locale.h
+++ b/src/include/utils/pg_locale.h
@@ -15,6 +15,9 @@
 #if defined(LOCALE_T_IN_XLOCALE) || defined(WCSTOMBS_L_IN_XLOCALE)
 #include <xlocale.h>
 #endif
+#ifdef USE_ICU
+#include <unicode/ucol.h>
+#endif
 
 #include "utils/guc.h"
 
@@ -61,17 +64,36 @@ extern void cache_locale_time(void);
  * We define our own wrapper around locale_t so we can keep the same
  * function signatures for all builds, while not having to create a
  * fake version of the standard type locale_t in the global namespace.
- * The fake version of pg_locale_t can be checked for truth; that's
- * about all it will be needed for.
+ * pg_locale_t is occasionally checked for truth, so make it a pointer.
  */
+struct pg_locale_t
+{
+	char	provider;
+	union
+	{
 #ifdef HAVE_LOCALE_T
-typedef locale_t pg_locale_t;
-#else
-typedef int pg_locale_t;
+		locale_t lt;
+#endif
+#ifdef USE_ICU
+		struct {
+			const char *locale;
+			UCollator *ucol;
+		} icu;
 #endif
+	} info;
+};
+
+typedef struct pg_locale_t *pg_locale_t;
 
 extern pg_locale_t pg_newlocale_from_collation(Oid collid);
 
+extern char *get_system_collation_version(char collprovider, const char *collcollate);
+
+#ifdef USE_ICU
+extern int32_t icu_to_uchar(UChar **buff_uchar, const char *buff, size_t nbytes);
+extern int32_t icu_from_uchar(char **result, UChar *buff_uchar, int32_t len_uchar);
+#endif
+
 /* These functions convert from/to libc's wchar_t, *not* pg_wchar_t */
 #ifdef USE_WIDE_UPPER_LOWER
 extern size_t wchar2char(char *to, const wchar_t *from, size_t tolen,
diff --git a/src/test/regress/GNUmakefile b/src/test/regress/GNUmakefile
index b923ea1420..a747facb9a 100644
--- a/src/test/regress/GNUmakefile
+++ b/src/test/regress/GNUmakefile
@@ -125,6 +125,9 @@ tablespace-setup:
 ##
 
 REGRESS_OPTS = --dlpath=. $(EXTRA_REGRESS_OPTS)
+ifeq ($(with_icu),yes)
+override EXTRA_TESTS := collate.icu $(EXTRA_TESTS)
+endif
 
 check: all tablespace-setup
 	$(pg_regress_check) $(REGRESS_OPTS) --schedule=$(srcdir)/parallel_schedule $(MAXCONNOPT) $(EXTRA_TESTS)
diff --git a/src/test/regress/expected/collate.linux.utf8.out b/src/test/regress/expected/collate.icu.out
similarity index 80%
copy from src/test/regress/expected/collate.linux.utf8.out
copy to src/test/regress/expected/collate.icu.out
index 293e78641e..a74577ade1 100644
--- a/src/test/regress/expected/collate.linux.utf8.out
+++ b/src/test/regress/expected/collate.icu.out
@@ -1,54 +1,54 @@
 /*
- * This test is for Linux/glibc systems and assumes that a full set of
- * locales is installed.  It must be run in a database with UTF-8 encoding,
- * because other encodings don't support all the characters used.
+ * This test is for ICU collations.
  */
 SET client_encoding TO UTF8;
+CREATE SCHEMA collate_tests;
+SET search_path = collate_tests;
 CREATE TABLE collate_test1 (
     a int,
-    b text COLLATE "en_US" NOT NULL
+    b text COLLATE "en-x-icu" NOT NULL
 );
 \d collate_test1
-           Table "public.collate_test1"
+        Table "collate_tests.collate_test1"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  a      | integer |           |          | 
- b      | text    | en_US     | not null | 
+ b      | text    | en-x-icu  | not null | 
 
 CREATE TABLE collate_test_fail (
     a int,
-    b text COLLATE "ja_JP.eucjp"
+    b text COLLATE "ja_JP.eucjp-x-icu"
 );
-ERROR:  collation "ja_JP.eucjp" for encoding "UTF8" does not exist
-LINE 3:     b text COLLATE "ja_JP.eucjp"
+ERROR:  collation "ja_JP.eucjp-x-icu" for encoding "UTF8" does not exist
+LINE 3:     b text COLLATE "ja_JP.eucjp-x-icu"
                    ^
 CREATE TABLE collate_test_fail (
     a int,
-    b text COLLATE "foo"
+    b text COLLATE "foo-x-icu"
 );
-ERROR:  collation "foo" for encoding "UTF8" does not exist
-LINE 3:     b text COLLATE "foo"
+ERROR:  collation "foo-x-icu" for encoding "UTF8" does not exist
+LINE 3:     b text COLLATE "foo-x-icu"
                    ^
 CREATE TABLE collate_test_fail (
-    a int COLLATE "en_US",
+    a int COLLATE "en-x-icu",
     b text
 );
 ERROR:  collations are not supported by type integer
-LINE 2:     a int COLLATE "en_US",
+LINE 2:     a int COLLATE "en-x-icu",
                   ^
 CREATE TABLE collate_test_like (
     LIKE collate_test1
 );
 \d collate_test_like
-         Table "public.collate_test_like"
+      Table "collate_tests.collate_test_like"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  a      | integer |           |          | 
- b      | text    | en_US     | not null | 
+ b      | text    | en-x-icu  | not null | 
 
 CREATE TABLE collate_test2 (
     a int,
-    b text COLLATE "sv_SE"
+    b text COLLATE "sv-x-icu"
 );
 CREATE TABLE collate_test3 (
     a int,
@@ -106,12 +106,12 @@ SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "C";
  3 | bbc
 (2 rows)
 
-SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "en_US";
-ERROR:  collation mismatch between explicit collations "C" and "en_US"
+SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "en-x-icu";
+ERROR:  collation mismatch between explicit collations "C" and "en-x-icu"
 LINE 1: ...* FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "e...
                                                              ^
-CREATE DOMAIN testdomain_sv AS text COLLATE "sv_SE";
-CREATE DOMAIN testdomain_i AS int COLLATE "sv_SE"; -- fails
+CREATE DOMAIN testdomain_sv AS text COLLATE "sv-x-icu";
+CREATE DOMAIN testdomain_i AS int COLLATE "sv-x-icu"; -- fails
 ERROR:  collations are not supported by type integer
 CREATE TABLE collate_test4 (
     a int,
@@ -129,7 +129,7 @@ SELECT a, b FROM collate_test4 ORDER BY b;
 
 CREATE TABLE collate_test5 (
     a int,
-    b testdomain_sv COLLATE "en_US"
+    b testdomain_sv COLLATE "en-x-icu"
 );
 INSERT INTO collate_test5 SELECT * FROM collate_test1;
 SELECT a, b FROM collate_test5 ORDER BY b;
@@ -206,13 +206,13 @@ SELECT * FROM collate_test3 ORDER BY b;
 (4 rows)
 
 -- constant expression folding
-SELECT 'bbc' COLLATE "en_US" > 'äbc' COLLATE "en_US" AS "true";
+SELECT 'bbc' COLLATE "en-x-icu" > 'äbc' COLLATE "en-x-icu" AS "true";
  true 
 ------
  t
 (1 row)
 
-SELECT 'bbc' COLLATE "sv_SE" > 'äbc' COLLATE "sv_SE" AS "false";
+SELECT 'bbc' COLLATE "sv-x-icu" > 'äbc' COLLATE "sv-x-icu" AS "false";
  false 
 -------
  f
@@ -221,8 +221,8 @@ SELECT 'bbc' COLLATE "sv_SE" > 'äbc' COLLATE "sv_SE" AS "false";
 -- upper/lower
 CREATE TABLE collate_test10 (
     a int,
-    x text COLLATE "en_US",
-    y text COLLATE "tr_TR"
+    x text COLLATE "en-x-icu",
+    y text COLLATE "tr-x-icu"
 );
 INSERT INTO collate_test10 VALUES (1, 'hij', 'hij'), (2, 'HIJ', 'HIJ');
 SELECT a, lower(x), lower(y), upper(x), upper(y), initcap(x), initcap(y) FROM collate_test10;
@@ -290,25 +290,25 @@ SELECT * FROM collate_test1 WHERE b ILIKE '%bc%';
  4 | ABC
 (4 rows)
 
-SELECT 'Türkiye' COLLATE "en_US" ILIKE '%KI%' AS "true";
+SELECT 'Türkiye' COLLATE "en-x-icu" ILIKE '%KI%' AS "true";
  true 
 ------
  t
 (1 row)
 
-SELECT 'Türkiye' COLLATE "tr_TR" ILIKE '%KI%' AS "false";
+SELECT 'Türkiye' COLLATE "tr-x-icu" ILIKE '%KI%' AS "false";
  false 
 -------
  f
 (1 row)
 
-SELECT 'bıt' ILIKE 'BIT' COLLATE "en_US" AS "false";
+SELECT 'bıt' ILIKE 'BIT' COLLATE "en-x-icu" AS "false";
  false 
 -------
  f
 (1 row)
 
-SELECT 'bıt' ILIKE 'BIT' COLLATE "tr_TR" AS "true";
+SELECT 'bıt' ILIKE 'BIT' COLLATE "tr-x-icu" AS "true";
  true 
 ------
  t
@@ -364,50 +364,75 @@ SELECT * FROM collate_test1 WHERE b ~* 'bc';
  4 | ABC
 (4 rows)
 
-SELECT 'Türkiye' COLLATE "en_US" ~* 'KI' AS "true";
+CREATE TABLE collate_test6 (
+    a int,
+    b text COLLATE "en-x-icu"
+);
+INSERT INTO collate_test6 VALUES (1, 'abc'), (2, 'ABC'), (3, '123'), (4, 'ab1'),
+                                 (5, 'a1!'), (6, 'a c'), (7, '!.;'), (8, '   '),
+                                 (9, 'äbç'), (10, 'ÄBÇ');
+SELECT b,
+       b ~ '^[[:alpha:]]+$' AS is_alpha,
+       b ~ '^[[:upper:]]+$' AS is_upper,
+       b ~ '^[[:lower:]]+$' AS is_lower,
+       b ~ '^[[:digit:]]+$' AS is_digit,
+       b ~ '^[[:alnum:]]+$' AS is_alnum,
+       b ~ '^[[:graph:]]+$' AS is_graph,
+       b ~ '^[[:print:]]+$' AS is_print,
+       b ~ '^[[:punct:]]+$' AS is_punct,
+       b ~ '^[[:space:]]+$' AS is_space
+FROM collate_test6;
+  b  | is_alpha | is_upper | is_lower | is_digit | is_alnum | is_graph | is_print | is_punct | is_space 
+-----+----------+----------+----------+----------+----------+----------+----------+----------+----------
+ abc | t        | f        | t        | f        | t        | t        | t        | f        | f
+ ABC | t        | t        | f        | f        | t        | t        | t        | f        | f
+ 123 | f        | f        | f        | t        | t        | t        | t        | f        | f
+ ab1 | f        | f        | f        | f        | t        | t        | t        | f        | f
+ a1! | f        | f        | f        | f        | f        | t        | t        | f        | f
+ a c | f        | f        | f        | f        | f        | f        | t        | f        | f
+ !.; | f        | f        | f        | f        | f        | t        | t        | t        | f
+     | f        | f        | f        | f        | f        | f        | t        | f        | t
+ äbç | t        | f        | t        | f        | t        | t        | t        | f        | f
+ ÄBÇ | t        | t        | f        | f        | t        | t        | t        | f        | f
+(10 rows)
+
+SELECT 'Türkiye' COLLATE "en-x-icu" ~* 'KI' AS "true";
  true 
 ------
  t
 (1 row)
 
-SELECT 'Türkiye' COLLATE "tr_TR" ~* 'KI' AS "false";
+SELECT 'Türkiye' COLLATE "tr-x-icu" ~* 'KI' AS "true";  -- true with ICU
+ true 
+------
+ t
+(1 row)
+
+SELECT 'bıt' ~* 'BIT' COLLATE "en-x-icu" AS "false";
  false 
 -------
  f
 (1 row)
 
-SELECT 'bıt' ~* 'BIT' COLLATE "en_US" AS "false";
+SELECT 'bıt' ~* 'BIT' COLLATE "tr-x-icu" AS "false";  -- false with ICU
  false 
 -------
  f
 (1 row)
 
-SELECT 'bıt' ~* 'BIT' COLLATE "tr_TR" AS "true";
- true 
-------
- t
-(1 row)
-
 -- The following actually exercises the selectivity estimation for ~*.
 SELECT relname FROM pg_class WHERE relname ~* '^abc';
  relname 
 ---------
 (0 rows)
 
+/* not run by default because it requires tr_TR system locale
 -- to_char
+
 SET lc_time TO 'tr_TR';
 SELECT to_char(date '2010-04-01', 'DD TMMON YYYY');
-   to_char   
--------------
- 01 NIS 2010
-(1 row)
-
-SELECT to_char(date '2010-04-01', 'DD TMMON YYYY' COLLATE "tr_TR");
-   to_char   
--------------
- 01 NİS 2010
-(1 row)
-
+SELECT to_char(date '2010-04-01', 'DD TMMON YYYY' COLLATE "tr-x-icu");
+*/
 -- backwards parsing
 CREATE VIEW collview1 AS SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc';
 CREATE VIEW collview2 AS SELECT a, b FROM collate_test1 ORDER BY b COLLATE "C";
@@ -693,7 +718,7 @@ SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM collate_test3; -- ok
 (8 rows)
 
 SELECT a, b FROM collate_test1 UNION SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
-ERROR:  collation mismatch between implicit collations "en_US" and "C"
+ERROR:  collation mismatch between implicit collations "en-x-icu" and "C"
 LINE 1: SELECT a, b FROM collate_test1 UNION SELECT a, b FROM collat...
                                                        ^
 HINT:  You can choose the collation by applying the COLLATE clause to one or both expressions.
@@ -707,12 +732,12 @@ SELECT a, b COLLATE "C" FROM collate_test1 UNION SELECT a, b FROM collate_test3
 (4 rows)
 
 SELECT a, b FROM collate_test1 INTERSECT SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
-ERROR:  collation mismatch between implicit collations "en_US" and "C"
+ERROR:  collation mismatch between implicit collations "en-x-icu" and "C"
 LINE 1: ...ELECT a, b FROM collate_test1 INTERSECT SELECT a, b FROM col...
                                                              ^
 HINT:  You can choose the collation by applying the COLLATE clause to one or both expressions.
 SELECT a, b FROM collate_test1 EXCEPT SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
-ERROR:  collation mismatch between implicit collations "en_US" and "C"
+ERROR:  collation mismatch between implicit collations "en-x-icu" and "C"
 LINE 1: SELECT a, b FROM collate_test1 EXCEPT SELECT a, b FROM colla...
                                                         ^
 HINT:  You can choose the collation by applying the COLLATE clause to one or both expressions.
@@ -731,18 +756,18 @@ select x || y from collate_test10; -- ok, because || is not collation aware
 (2 rows)
 
 select x, y from collate_test10 order by x || y; -- not so ok
-ERROR:  collation mismatch between implicit collations "en_US" and "tr_TR"
+ERROR:  collation mismatch between implicit collations "en-x-icu" and "tr-x-icu"
 LINE 1: select x, y from collate_test10 order by x || y;
                                                       ^
 HINT:  You can choose the collation by applying the COLLATE clause to one or both expressions.
 -- collation mismatch between recursive and non-recursive term
 WITH RECURSIVE foo(x) AS
-   (SELECT x FROM (VALUES('a' COLLATE "en_US"),('b')) t(x)
+   (SELECT x FROM (VALUES('a' COLLATE "en-x-icu"),('b')) t(x)
    UNION ALL
-   SELECT (x || 'c') COLLATE "de_DE" FROM foo WHERE length(x) < 10)
+   SELECT (x || 'c') COLLATE "de-x-icu" FROM foo WHERE length(x) < 10)
 SELECT * FROM foo;
-ERROR:  recursive query "foo" column 1 has collation "en_US" in non-recursive term but collation "de_DE" overall
-LINE 2:    (SELECT x FROM (VALUES('a' COLLATE "en_US"),('b')) t(x)
+ERROR:  recursive query "foo" column 1 has collation "en-x-icu" in non-recursive term but collation "de-x-icu" overall
+LINE 2:    (SELECT x FROM (VALUES('a' COLLATE "en-x-icu"),('b')) t(x...
                    ^
 HINT:  Use the COLLATE clause to set the collation of the non-recursive term.
 -- casting
@@ -843,7 +868,7 @@ begin
   return xx < yy;
 end
 $$;
-SELECT mylt2('a', 'B' collate "en_US") as t, mylt2('a', 'B' collate "C") as f;
+SELECT mylt2('a', 'B' collate "en-x-icu") as t, mylt2('a', 'B' collate "C") as f;
  t | f 
 ---+---
  t | f
@@ -957,29 +982,23 @@ CREATE SCHEMA test_schema;
 -- We need to do this this way to cope with varying names for encodings:
 do $$
 BEGIN
-  EXECUTE 'CREATE COLLATION test0 (locale = ' ||
+  EXECUTE 'CREATE COLLATION test0 (provider = icu, locale = ' ||
           quote_literal(current_setting('lc_collate')) || ');';
 END
 $$;
 CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
-ERROR:  collation "test0" for encoding "UTF8" already exists
-CREATE COLLATION IF NOT EXISTS test0 FROM "C"; -- ok, skipped
-NOTICE:  collation "test0" for encoding "UTF8" already exists, skipping
-CREATE COLLATION IF NOT EXISTS test0 (locale = 'foo'); -- ok, skipped
-NOTICE:  collation "test0" for encoding "UTF8" already exists, skipping
+ERROR:  collation "test0" already exists
 do $$
 BEGIN
-  EXECUTE 'CREATE COLLATION test1 (lc_collate = ' ||
+  EXECUTE 'CREATE COLLATION test1 (provider = icu, lc_collate = ' ||
           quote_literal(current_setting('lc_collate')) ||
           ', lc_ctype = ' ||
           quote_literal(current_setting('lc_ctype')) || ');';
 END
 $$;
-CREATE COLLATION test3 (lc_collate = 'en_US.utf8'); -- fail, need lc_ctype
+CREATE COLLATION test3 (provider = icu, lc_collate = 'en_US.utf8'); -- fail, need lc_ctype
 ERROR:  parameter "lc_ctype" must be specified
-CREATE COLLATION testx (locale = 'nonsense'); -- fail
-ERROR:  could not create locale "nonsense": No such file or directory
-DETAIL:  The operating system could not find any locale data for the locale name "nonsense".
+CREATE COLLATION testx (provider = icu, locale = 'nonsense'); /* never fails with ICU */  DROP COLLATION testx;
 CREATE COLLATION test4 FROM nonsense;
 ERROR:  collation "nonsense" for encoding "UTF8" does not exist
 CREATE COLLATION test5 FROM test0;
@@ -993,7 +1012,7 @@ SELECT collname FROM pg_collation WHERE collname LIKE 'test%' ORDER BY 1;
 
 ALTER COLLATION test1 RENAME TO test11;
 ALTER COLLATION test0 RENAME TO test11; -- fail
-ERROR:  collation "test11" for encoding "UTF8" already exists in schema "public"
+ERROR:  collation "test11" already exists in schema "collate_tests"
 ALTER COLLATION test1 RENAME TO test22; -- fail
 ERROR:  collation "test1" for encoding "UTF8" does not exist
 ALTER COLLATION test11 OWNER TO regress_test_role;
@@ -1005,11 +1024,11 @@ SELECT collname, nspname, obj_description(pg_collation.oid, 'pg_collation')
     FROM pg_collation JOIN pg_namespace ON (collnamespace = pg_namespace.oid)
     WHERE collname LIKE 'test%'
     ORDER BY 1;
- collname |   nspname   | obj_description 
-----------+-------------+-----------------
- test0    | public      | US English
- test11   | test_schema | 
- test5    | public      | 
+ collname |    nspname    | obj_description 
+----------+---------------+-----------------
+ test0    | collate_tests | US English
+ test11   | test_schema   | 
+ test5    | collate_tests | 
 (3 rows)
 
 DROP COLLATION test0, test_schema.test11, test5;
@@ -1024,6 +1043,9 @@ SELECT collname FROM pg_collation WHERE collname LIKE 'test%';
 
 DROP SCHEMA test_schema;
 DROP ROLE regress_test_role;
+-- ALTER
+ALTER COLLATION "en-x-icu" REFRESH VERSION;
+NOTICE:  version has not changed
 -- dependencies
 CREATE COLLATION test0 FROM "C";
 CREATE TABLE collate_dep_test1 (a int, b text COLLATE test0);
@@ -1048,13 +1070,13 @@ drop cascades to composite type collate_dep_test2 column y
 drop cascades to view collate_dep_test3
 drop cascades to index collate_dep_test4i
 \d collate_dep_test1
-         Table "public.collate_dep_test1"
+      Table "collate_tests.collate_dep_test1"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  a      | integer |           |          | 
 
 \d collate_dep_test2
-     Composite type "public.collate_dep_test2"
+ Composite type "collate_tests.collate_dep_test2"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  x      | integer |           |          | 
@@ -1063,7 +1085,7 @@ DROP TABLE collate_dep_test1, collate_dep_test4t;
 DROP TYPE collate_dep_test2;
 -- test range types and collations
 create type textrange_c as range(subtype=text, collation="C");
-create type textrange_en_us as range(subtype=text, collation="en_US");
+create type textrange_en_us as range(subtype=text, collation="en-x-icu");
 select textrange_c('A','Z') @> 'b'::text;
  ?column? 
 ----------
@@ -1078,3 +1100,24 @@ select textrange_en_us('A','Z') @> 'b'::text;
 
 drop type textrange_c;
 drop type textrange_en_us;
+-- cleanup
+DROP SCHEMA collate_tests CASCADE;
+NOTICE:  drop cascades to 18 other objects
+DETAIL:  drop cascades to table collate_test1
+drop cascades to table collate_test_like
+drop cascades to table collate_test2
+drop cascades to table collate_test3
+drop cascades to type testdomain_sv
+drop cascades to table collate_test4
+drop cascades to table collate_test5
+drop cascades to table collate_test10
+drop cascades to table collate_test6
+drop cascades to view collview1
+drop cascades to view collview2
+drop cascades to view collview3
+drop cascades to type testdomain
+drop cascades to function mylt(text,text)
+drop cascades to function mylt_noninline(text,text)
+drop cascades to function mylt_plpgsql(text,text)
+drop cascades to function mylt2(text,text)
+drop cascades to function dup(anyelement)
diff --git a/src/test/regress/expected/collate.linux.utf8.out b/src/test/regress/expected/collate.linux.utf8.out
index 293e78641e..332fb4dbbd 100644
--- a/src/test/regress/expected/collate.linux.utf8.out
+++ b/src/test/regress/expected/collate.linux.utf8.out
@@ -4,12 +4,14 @@
  * because other encodings don't support all the characters used.
  */
 SET client_encoding TO UTF8;
+CREATE SCHEMA collate_tests;
+SET search_path = collate_tests;
 CREATE TABLE collate_test1 (
     a int,
     b text COLLATE "en_US" NOT NULL
 );
 \d collate_test1
-           Table "public.collate_test1"
+        Table "collate_tests.collate_test1"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  a      | integer |           |          | 
@@ -40,7 +42,7 @@ CREATE TABLE collate_test_like (
     LIKE collate_test1
 );
 \d collate_test_like
-         Table "public.collate_test_like"
+      Table "collate_tests.collate_test_like"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  a      | integer |           |          | 
@@ -364,6 +366,38 @@ SELECT * FROM collate_test1 WHERE b ~* 'bc';
  4 | ABC
 (4 rows)
 
+CREATE TABLE collate_test6 (
+    a int,
+    b text COLLATE "en_US"
+);
+INSERT INTO collate_test6 VALUES (1, 'abc'), (2, 'ABC'), (3, '123'), (4, 'ab1'),
+                                 (5, 'a1!'), (6, 'a c'), (7, '!.;'), (8, '   '),
+                                 (9, 'äbç'), (10, 'ÄBÇ');
+SELECT b,
+       b ~ '^[[:alpha:]]+$' AS is_alpha,
+       b ~ '^[[:upper:]]+$' AS is_upper,
+       b ~ '^[[:lower:]]+$' AS is_lower,
+       b ~ '^[[:digit:]]+$' AS is_digit,
+       b ~ '^[[:alnum:]]+$' AS is_alnum,
+       b ~ '^[[:graph:]]+$' AS is_graph,
+       b ~ '^[[:print:]]+$' AS is_print,
+       b ~ '^[[:punct:]]+$' AS is_punct,
+       b ~ '^[[:space:]]+$' AS is_space
+FROM collate_test6;
+  b  | is_alpha | is_upper | is_lower | is_digit | is_alnum | is_graph | is_print | is_punct | is_space 
+-----+----------+----------+----------+----------+----------+----------+----------+----------+----------
+ abc | t        | f        | t        | f        | t        | t        | t        | f        | f
+ ABC | t        | t        | f        | f        | t        | t        | t        | f        | f
+ 123 | f        | f        | f        | t        | t        | t        | t        | f        | f
+ ab1 | f        | f        | f        | f        | t        | t        | t        | f        | f
+ a1! | f        | f        | f        | f        | f        | t        | t        | f        | f
+ a c | f        | f        | f        | f        | f        | f        | t        | f        | f
+ !.; | f        | f        | f        | f        | f        | t        | t        | t        | f
+     | f        | f        | f        | f        | f        | f        | t        | f        | t
+ äbç | t        | f        | t        | f        | t        | t        | t        | f        | f
+ ÄBÇ | t        | t        | f        | f        | t        | t        | t        | f        | f
+(10 rows)
+
 SELECT 'Türkiye' COLLATE "en_US" ~* 'KI' AS "true";
  true 
 ------
@@ -993,7 +1027,7 @@ SELECT collname FROM pg_collation WHERE collname LIKE 'test%' ORDER BY 1;
 
 ALTER COLLATION test1 RENAME TO test11;
 ALTER COLLATION test0 RENAME TO test11; -- fail
-ERROR:  collation "test11" for encoding "UTF8" already exists in schema "public"
+ERROR:  collation "test11" for encoding "UTF8" already exists in schema "collate_tests"
 ALTER COLLATION test1 RENAME TO test22; -- fail
 ERROR:  collation "test1" for encoding "UTF8" does not exist
 ALTER COLLATION test11 OWNER TO regress_test_role;
@@ -1005,11 +1039,11 @@ SELECT collname, nspname, obj_description(pg_collation.oid, 'pg_collation')
     FROM pg_collation JOIN pg_namespace ON (collnamespace = pg_namespace.oid)
     WHERE collname LIKE 'test%'
     ORDER BY 1;
- collname |   nspname   | obj_description 
-----------+-------------+-----------------
- test0    | public      | US English
- test11   | test_schema | 
- test5    | public      | 
+ collname |    nspname    | obj_description 
+----------+---------------+-----------------
+ test0    | collate_tests | US English
+ test11   | test_schema   | 
+ test5    | collate_tests | 
 (3 rows)
 
 DROP COLLATION test0, test_schema.test11, test5;
@@ -1024,6 +1058,9 @@ SELECT collname FROM pg_collation WHERE collname LIKE 'test%';
 
 DROP SCHEMA test_schema;
 DROP ROLE regress_test_role;
+-- ALTER
+ALTER COLLATION "en_US" REFRESH VERSION;
+NOTICE:  version has not changed
 -- dependencies
 CREATE COLLATION test0 FROM "C";
 CREATE TABLE collate_dep_test1 (a int, b text COLLATE test0);
@@ -1048,13 +1085,13 @@ drop cascades to composite type collate_dep_test2 column y
 drop cascades to view collate_dep_test3
 drop cascades to index collate_dep_test4i
 \d collate_dep_test1
-         Table "public.collate_dep_test1"
+      Table "collate_tests.collate_dep_test1"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  a      | integer |           |          | 
 
 \d collate_dep_test2
-     Composite type "public.collate_dep_test2"
+ Composite type "collate_tests.collate_dep_test2"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  x      | integer |           |          | 
@@ -1078,3 +1115,24 @@ select textrange_en_us('A','Z') @> 'b'::text;
 
 drop type textrange_c;
 drop type textrange_en_us;
+-- cleanup
+DROP SCHEMA collate_tests CASCADE;
+NOTICE:  drop cascades to 18 other objects
+DETAIL:  drop cascades to table collate_test1
+drop cascades to table collate_test_like
+drop cascades to table collate_test2
+drop cascades to table collate_test3
+drop cascades to type testdomain_sv
+drop cascades to table collate_test4
+drop cascades to table collate_test5
+drop cascades to table collate_test10
+drop cascades to table collate_test6
+drop cascades to view collview1
+drop cascades to view collview2
+drop cascades to view collview3
+drop cascades to type testdomain
+drop cascades to function mylt(text,text)
+drop cascades to function mylt_noninline(text,text)
+drop cascades to function mylt_plpgsql(text,text)
+drop cascades to function mylt2(text,text)
+drop cascades to function dup(anyelement)
diff --git a/src/test/regress/sql/collate.linux.utf8.sql b/src/test/regress/sql/collate.icu.sql
similarity index 81%
copy from src/test/regress/sql/collate.linux.utf8.sql
copy to src/test/regress/sql/collate.icu.sql
index c349cbde2b..0ea60d415c 100644
--- a/src/test/regress/sql/collate.linux.utf8.sql
+++ b/src/test/regress/sql/collate.icu.sql
@@ -1,31 +1,32 @@
 /*
- * This test is for Linux/glibc systems and assumes that a full set of
- * locales is installed.  It must be run in a database with UTF-8 encoding,
- * because other encodings don't support all the characters used.
+ * This test is for ICU collations.
  */
 
 SET client_encoding TO UTF8;
 
+CREATE SCHEMA collate_tests;
+SET search_path = collate_tests;
+
 
 CREATE TABLE collate_test1 (
     a int,
-    b text COLLATE "en_US" NOT NULL
+    b text COLLATE "en-x-icu" NOT NULL
 );
 
 \d collate_test1
 
 CREATE TABLE collate_test_fail (
     a int,
-    b text COLLATE "ja_JP.eucjp"
+    b text COLLATE "ja_JP.eucjp-x-icu"
 );
 
 CREATE TABLE collate_test_fail (
     a int,
-    b text COLLATE "foo"
+    b text COLLATE "foo-x-icu"
 );
 
 CREATE TABLE collate_test_fail (
-    a int COLLATE "en_US",
+    a int COLLATE "en-x-icu",
     b text
 );
 
@@ -37,7 +38,7 @@ CREATE TABLE collate_test_like (
 
 CREATE TABLE collate_test2 (
     a int,
-    b text COLLATE "sv_SE"
+    b text COLLATE "sv-x-icu"
 );
 
 CREATE TABLE collate_test3 (
@@ -57,11 +58,11 @@ CREATE TABLE collate_test3 (
 SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc';
 SELECT * FROM collate_test1 WHERE b >= 'bbc' COLLATE "C";
 SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "C";
-SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "en_US";
+SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "en-x-icu";
 
 
-CREATE DOMAIN testdomain_sv AS text COLLATE "sv_SE";
-CREATE DOMAIN testdomain_i AS int COLLATE "sv_SE"; -- fails
+CREATE DOMAIN testdomain_sv AS text COLLATE "sv-x-icu";
+CREATE DOMAIN testdomain_i AS int COLLATE "sv-x-icu"; -- fails
 CREATE TABLE collate_test4 (
     a int,
     b testdomain_sv
@@ -71,7 +72,7 @@ CREATE TABLE collate_test4 (
 
 CREATE TABLE collate_test5 (
     a int,
-    b testdomain_sv COLLATE "en_US"
+    b testdomain_sv COLLATE "en-x-icu"
 );
 INSERT INTO collate_test5 SELECT * FROM collate_test1;
 SELECT a, b FROM collate_test5 ORDER BY b;
@@ -89,15 +90,15 @@ CREATE TABLE collate_test5 (
 SELECT * FROM collate_test3 ORDER BY b;
 
 -- constant expression folding
-SELECT 'bbc' COLLATE "en_US" > 'äbc' COLLATE "en_US" AS "true";
-SELECT 'bbc' COLLATE "sv_SE" > 'äbc' COLLATE "sv_SE" AS "false";
+SELECT 'bbc' COLLATE "en-x-icu" > 'äbc' COLLATE "en-x-icu" AS "true";
+SELECT 'bbc' COLLATE "sv-x-icu" > 'äbc' COLLATE "sv-x-icu" AS "false";
 
 -- upper/lower
 
 CREATE TABLE collate_test10 (
     a int,
-    x text COLLATE "en_US",
-    y text COLLATE "tr_TR"
+    x text COLLATE "en-x-icu",
+    y text COLLATE "tr-x-icu"
 );
 
 INSERT INTO collate_test10 VALUES (1, 'hij', 'hij'), (2, 'HIJ', 'HIJ');
@@ -116,11 +117,11 @@ CREATE TABLE collate_test10 (
 SELECT * FROM collate_test1 WHERE b ILIKE 'abc%';
 SELECT * FROM collate_test1 WHERE b ILIKE '%bc%';
 
-SELECT 'Türkiye' COLLATE "en_US" ILIKE '%KI%' AS "true";
-SELECT 'Türkiye' COLLATE "tr_TR" ILIKE '%KI%' AS "false";
+SELECT 'Türkiye' COLLATE "en-x-icu" ILIKE '%KI%' AS "true";
+SELECT 'Türkiye' COLLATE "tr-x-icu" ILIKE '%KI%' AS "false";
 
-SELECT 'bıt' ILIKE 'BIT' COLLATE "en_US" AS "false";
-SELECT 'bıt' ILIKE 'BIT' COLLATE "tr_TR" AS "true";
+SELECT 'bıt' ILIKE 'BIT' COLLATE "en-x-icu" AS "false";
+SELECT 'bıt' ILIKE 'BIT' COLLATE "tr-x-icu" AS "true";
 
 -- The following actually exercises the selectivity estimation for ILIKE.
 SELECT relname FROM pg_class WHERE relname ILIKE 'abc%';
@@ -134,21 +135,42 @@ CREATE TABLE collate_test10 (
 SELECT * FROM collate_test1 WHERE b ~* '^abc';
 SELECT * FROM collate_test1 WHERE b ~* 'bc';
 
-SELECT 'Türkiye' COLLATE "en_US" ~* 'KI' AS "true";
-SELECT 'Türkiye' COLLATE "tr_TR" ~* 'KI' AS "false";
-
-SELECT 'bıt' ~* 'BIT' COLLATE "en_US" AS "false";
-SELECT 'bıt' ~* 'BIT' COLLATE "tr_TR" AS "true";
+CREATE TABLE collate_test6 (
+    a int,
+    b text COLLATE "en-x-icu"
+);
+INSERT INTO collate_test6 VALUES (1, 'abc'), (2, 'ABC'), (3, '123'), (4, 'ab1'),
+                                 (5, 'a1!'), (6, 'a c'), (7, '!.;'), (8, '   '),
+                                 (9, 'äbç'), (10, 'ÄBÇ');
+SELECT b,
+       b ~ '^[[:alpha:]]+$' AS is_alpha,
+       b ~ '^[[:upper:]]+$' AS is_upper,
+       b ~ '^[[:lower:]]+$' AS is_lower,
+       b ~ '^[[:digit:]]+$' AS is_digit,
+       b ~ '^[[:alnum:]]+$' AS is_alnum,
+       b ~ '^[[:graph:]]+$' AS is_graph,
+       b ~ '^[[:print:]]+$' AS is_print,
+       b ~ '^[[:punct:]]+$' AS is_punct,
+       b ~ '^[[:space:]]+$' AS is_space
+FROM collate_test6;
+
+SELECT 'Türkiye' COLLATE "en-x-icu" ~* 'KI' AS "true";
+SELECT 'Türkiye' COLLATE "tr-x-icu" ~* 'KI' AS "true";  -- true with ICU
+
+SELECT 'bıt' ~* 'BIT' COLLATE "en-x-icu" AS "false";
+SELECT 'bıt' ~* 'BIT' COLLATE "tr-x-icu" AS "false";  -- false with ICU
 
 -- The following actually exercises the selectivity estimation for ~*.
 SELECT relname FROM pg_class WHERE relname ~* '^abc';
 
 
+/* not run by default because it requires tr_TR system locale
 -- to_char
 
 SET lc_time TO 'tr_TR';
 SELECT to_char(date '2010-04-01', 'DD TMMON YYYY');
-SELECT to_char(date '2010-04-01', 'DD TMMON YYYY' COLLATE "tr_TR");
+SELECT to_char(date '2010-04-01', 'DD TMMON YYYY' COLLATE "tr-x-icu");
+*/
 
 
 -- backwards parsing
@@ -218,9 +240,9 @@ CREATE TABLE test_u AS SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM
 
 -- collation mismatch between recursive and non-recursive term
 WITH RECURSIVE foo(x) AS
-   (SELECT x FROM (VALUES('a' COLLATE "en_US"),('b')) t(x)
+   (SELECT x FROM (VALUES('a' COLLATE "en-x-icu"),('b')) t(x)
    UNION ALL
-   SELECT (x || 'c') COLLATE "de_DE" FROM foo WHERE length(x) < 10)
+   SELECT (x || 'c') COLLATE "de-x-icu" FROM foo WHERE length(x) < 10)
 SELECT * FROM foo;
 
 
@@ -268,7 +290,7 @@ CREATE FUNCTION mylt2 (x text, y text) RETURNS boolean LANGUAGE plpgsql AS $$
 end
 $$;
 
-SELECT mylt2('a', 'B' collate "en_US") as t, mylt2('a', 'B' collate "C") as f;
+SELECT mylt2('a', 'B' collate "en-x-icu") as t, mylt2('a', 'B' collate "C") as f;
 
 CREATE OR REPLACE FUNCTION
   mylt2 (x text, y text) RETURNS boolean LANGUAGE plpgsql AS $$
@@ -320,23 +342,21 @@ CREATE SCHEMA test_schema;
 -- We need to do this this way to cope with varying names for encodings:
 do $$
 BEGIN
-  EXECUTE 'CREATE COLLATION test0 (locale = ' ||
+  EXECUTE 'CREATE COLLATION test0 (provider = icu, locale = ' ||
           quote_literal(current_setting('lc_collate')) || ');';
 END
 $$;
 CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
-CREATE COLLATION IF NOT EXISTS test0 FROM "C"; -- ok, skipped
-CREATE COLLATION IF NOT EXISTS test0 (locale = 'foo'); -- ok, skipped
 do $$
 BEGIN
-  EXECUTE 'CREATE COLLATION test1 (lc_collate = ' ||
+  EXECUTE 'CREATE COLLATION test1 (provider = icu, lc_collate = ' ||
           quote_literal(current_setting('lc_collate')) ||
           ', lc_ctype = ' ||
           quote_literal(current_setting('lc_ctype')) || ');';
 END
 $$;
-CREATE COLLATION test3 (lc_collate = 'en_US.utf8'); -- fail, need lc_ctype
-CREATE COLLATION testx (locale = 'nonsense'); -- fail
+CREATE COLLATION test3 (provider = icu, lc_collate = 'en_US.utf8'); -- fail, need lc_ctype
+CREATE COLLATION testx (provider = icu, locale = 'nonsense'); /* never fails with ICU */  DROP COLLATION testx;
 
 CREATE COLLATION test4 FROM nonsense;
 CREATE COLLATION test5 FROM test0;
@@ -368,6 +388,11 @@ CREATE COLLATION test5 FROM test0;
 DROP ROLE regress_test_role;
 
 
+-- ALTER
+
+ALTER COLLATION "en-x-icu" REFRESH VERSION;
+
+
 -- dependencies
 
 CREATE COLLATION test0 FROM "C";
@@ -391,10 +416,14 @@ CREATE INDEX collate_dep_test4i ON collate_dep_test4t (b COLLATE test0);
 -- test range types and collations
 
 create type textrange_c as range(subtype=text, collation="C");
-create type textrange_en_us as range(subtype=text, collation="en_US");
+create type textrange_en_us as range(subtype=text, collation="en-x-icu");
 
 select textrange_c('A','Z') @> 'b'::text;
 select textrange_en_us('A','Z') @> 'b'::text;
 
 drop type textrange_c;
 drop type textrange_en_us;
+
+
+-- cleanup
+DROP SCHEMA collate_tests CASCADE;
diff --git a/src/test/regress/sql/collate.linux.utf8.sql b/src/test/regress/sql/collate.linux.utf8.sql
index c349cbde2b..78c7ac2e44 100644
--- a/src/test/regress/sql/collate.linux.utf8.sql
+++ b/src/test/regress/sql/collate.linux.utf8.sql
@@ -6,6 +6,9 @@
 
 SET client_encoding TO UTF8;
 
+CREATE SCHEMA collate_tests;
+SET search_path = collate_tests;
+
 
 CREATE TABLE collate_test1 (
     a int,
@@ -134,6 +137,25 @@ CREATE TABLE collate_test10 (
 SELECT * FROM collate_test1 WHERE b ~* '^abc';
 SELECT * FROM collate_test1 WHERE b ~* 'bc';
 
+CREATE TABLE collate_test6 (
+    a int,
+    b text COLLATE "en_US"
+);
+INSERT INTO collate_test6 VALUES (1, 'abc'), (2, 'ABC'), (3, '123'), (4, 'ab1'),
+                                 (5, 'a1!'), (6, 'a c'), (7, '!.;'), (8, '   '),
+                                 (9, 'äbç'), (10, 'ÄBÇ');
+SELECT b,
+       b ~ '^[[:alpha:]]+$' AS is_alpha,
+       b ~ '^[[:upper:]]+$' AS is_upper,
+       b ~ '^[[:lower:]]+$' AS is_lower,
+       b ~ '^[[:digit:]]+$' AS is_digit,
+       b ~ '^[[:alnum:]]+$' AS is_alnum,
+       b ~ '^[[:graph:]]+$' AS is_graph,
+       b ~ '^[[:print:]]+$' AS is_print,
+       b ~ '^[[:punct:]]+$' AS is_punct,
+       b ~ '^[[:space:]]+$' AS is_space
+FROM collate_test6;
+
 SELECT 'Türkiye' COLLATE "en_US" ~* 'KI' AS "true";
 SELECT 'Türkiye' COLLATE "tr_TR" ~* 'KI' AS "false";
 
@@ -368,6 +390,11 @@ CREATE COLLATION test5 FROM test0;
 DROP ROLE regress_test_role;
 
 
+-- ALTER
+
+ALTER COLLATION "en_US" REFRESH VERSION;
+
+
 -- dependencies
 
 CREATE COLLATION test0 FROM "C";
@@ -398,3 +425,7 @@ CREATE INDEX collate_dep_test4i ON collate_dep_test4t (b COLLATE test0);
 
 drop type textrange_c;
 drop type textrange_en_us;
+
+
+-- cleanup
+DROP SCHEMA collate_tests CASCADE;
-- 
2.12.0

#75Andreas Karlsson
andreas@proxel.se
In reply to: Peter Eisentraut (#74)
Re: ICU integration

On 03/15/2017 05:33 PM, Peter Eisentraut wrote:

Updated patch attached.

Ok, I applied the patch again and ran the tests. I get a test failure on
make check-world in the pg_dump tests, but it can be fixed with the change below.

diff --git a/src/bin/pg_dump/t/002_pg_dump.pl 
b/src/bin/pg_dump/t/002_pg_dump.pl
index 3cac4a9ae0..7d9c90363b 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -2422,7 +2422,7 @@ qr/^\QINSERT INTO test_fifth_table (col1, col2, 
col3, col4, col5) VALUES (NULL,
           'CREATE COLLATION test0 FROM "C";',
         regexp =>
           qr/^
-         \QCREATE COLLATION test0 (lc_collate = 'C', lc_ctype = 'C');\E/xm,
+         \QCREATE COLLATION test0 (provider = 'libc', locale = 'C');\E/xm,
         like => {
             binary_upgrade           => 1,
             clean                    => 1,

- I do not like how it is not obvious which is the default version of
every locale. E.g. I believe that "sv%icu" is "sv_reformed%icu" and not
"sv_standard%icu" as one might expect. Is this inherent to ICU or
something we can work around?

We get these keywords from ucol_getKeywordValuesForLocale(), which says
"Given a key and a locale, returns an array of string values in a
preferred order that would make a difference." So all those are
supposedly different from each other.

I believe you are mistaken. The locale "sv" is just an alias for
"sv-u-standard" as far as I can tell. See the definition of the Swedish
locale
(http://source.icu-project.org/repos/icu/trunk/icu4c/source/data/coll/sv.txt)
and there are just three collations: reformed (default), standard,
search (plus eor and emoji, which are inherited).

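As a quick sanity check on which Swedish variants end up in the catalog, one
can look at pg_collation directly (a sketch; the exact collation names depend
on the naming scheme in the current patch version and on the installed ICU
data):

-- ICU-provided Swedish collations created at initdb (illustrative)
SELECT collname, collcollate, collprovider
FROM pg_collation
WHERE collname LIKE 'sv%icu'
ORDER BY collname;
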
I am also quite annoyed at col-emoji and col-eor (European Ordering
Rules). They are defined at the root and inherited by all languages, but
no language modifies col-emoji for their needs which makes it a foot
gun. See the Danish sorting example below where at least I expected the
same order. For col-eor it makes a bit more sense since I would expect
the locales en_GB-u-col-eor and sv_SE-u-col-eor to collate exactly the same.

# SELECT * FROM (VALUES ('a'), ('k'), ('aa')) q (x) ORDER BY x COLLATE
"da-x-icu";
x
----
a
k
aa
(3 rows)

# SELECT * FROM (VALUES ('a'), ('k'), ('aa')) q (x) ORDER BY x COLLATE
"da-u-co-emoji-x-icu";
x
----
a
aa
k
(3 rows)

It seems ICU has made quite the mess here, and I am not sure to what
degree we need to handle it to avoid our users getting confused. I need
to think about some of it, and would love input from others. Maybe the right
thing to do is to ignore the issue with col-emoji, but I would want to
do something about the default collations.

- ICU talks about a new syntax for locale extensions (old:
de_DE@collation=phonebook, new: de_DE_u_co_phonebk) at this page
http://userguide.icu-project.org/collation/api. Is this something we
need to car about? It looks like we currently use the old format, and
while I personally prefer it I am not sure we should rely on an old syntax.

Interesting. I hadn't see this before, and the documentation is sparse.
But it seems that this refers to BCP 47 language tags, which seem like
a good idea.

So what I have done is change all the predefined ICU collations to be
named after the BCP 47 scheme, with a "private use" -x-icu suffix
(instead of %icu). The preserves the original idea but uses a standard
naming scheme.

I'm not terribly worried that they are going to remove the "old" locale
naming, but just to be forward looking, I have changed it so that the
collcollate entries are made using the "new" naming for ICU >=54.

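For illustration, creating a custom collation from a BCP 47 tag with a
collation keyword would then look roughly like this (a sketch; the option
names follow the patch, and 'de-u-co-phonebk' is just an example locale
string, not something the patch installs by default):

CREATE COLLATION german_phonebook (provider = icu, locale = 'de-u-co-phonebk');
SELECT * FROM (VALUES ('Göbel'), ('Gocke'), ('Goethe')) t(x)
  ORDER BY x COLLATE german_phonebook;
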
Sounds good.

- I get an error when creating a new locale.

#CREATE COLLATION sv2 (LOCALE = 'sv');
ERROR: could not create locale "sv": Success

# CREATE COLLATION sv2 (LOCALE = 'sv');
ERROR: could not create locale "sv": Resource temporarily unavailable
Time: 1.109 ms

Hmm, that's pretty straightforward code. What is your operating system?
What are the build options? Does it work without this patch?

This issue is unrelated to ICU. I had forgotten to specify the provider, so the
errors are correct (even though the value of the errno is weird).

- For the collprovider is it really correct to say that 'd' is the
default value as it does in catalogs.sgml?

It doesn't say it's the default value, it says it uses the database
default. This is all a bit confusing. We have a collation object named
"default", which uses the locale set for the database. That's been that
way for a while. Now when introducing the collation providers, that
"default" collation gets its own collprovider category 'd'. That is not
really used anywhere, but we have to put something there.

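To make that concrete, the provider of each collation can be inspected
directly (a sketch; the single-letter codes 'd' for the database default,
'c' for libc and 'i' for ICU are assumed here):

SELECT collname, collprovider, collcollate
FROM pg_collation
WHERE collname IN ('default', 'C', 'en-x-icu');
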
Ah, I see now. Hm, that is a bit awkward but I cannot think of a cleaner
alternative.

- I do not know what the policy for formatting the documentation is, but
some of the paragraphs are in need of re-wrapping.

Hmm, I don't see anything terribly bad.

Maybe it is just me who is sensitive to wrapping, but here is an example
which sticks out to me with its 98-character line.

+   <para>
+    A collation object provided by <literal>libc</literal> maps to a 
combination
+    of <symbol>LC_COLLATE</symbol> and <symbol>LC_CTYPE</symbol> 
settings.  (As
      the name would suggest, the main purpose of a collation is to set
      <symbol>LC_COLLATE</symbol>, which controls the sort order.  But
      it is rarely necessary in practice to have an
      <symbol>LC_CTYPE</symbol> setting that is different from
      <symbol>LC_COLLATE</symbol>, so it is more convenient to collect
      these under one concept than to create another infrastructure for
-    setting <symbol>LC_CTYPE</symbol> per expression.)  Also, a collation
+    setting <symbol>LC_CTYPE</symbol> per expression.)  Also, a 
<literal>libc</literal> collation
      is tied to a character set encoding (see <xref linkend="multibyte">).
      The same collation name may exist for different encodings.
     </para>

- Add a hint to "ERROR: conflicting or redundant options". The error
message is pretty unclear.

I don't see that in my patch. Example?

It is for the below. But after some thinking I am fine not fixing it.

# CREATE COLLATION svi (LOCALE = 'sv', LC_COLLATE = 'sv', PROVIDER = icu);
ERROR: conflicting or redundant options

- I am not a fan of this patch comparing the encoding with a -1 literal.
How about adding -1 as a value to the enum? See the example below for a
place where the patch compares with -1.

That's existing practice. Not a great practice, probably, but widespread.

Ok, if it is widespread in the code then let's keep using it. In a case
like this consistency is just as important.

- The patch adds "FIXME" in the below. Is that a left-over from
development or something which still needs doing?

/*
* Also forbid matching an any-encoding entry.  This test of course
is not
* backed up by the unique index, but it's not a problem since we don't
-    * support adding any-encoding entries after initdb.
+    * support adding any-encoding entries after initdb. FIXME
*/

I had mentioned that upthread. It technically needs "doing" as you say,
but it's not clear how and it's not terribly important, arguably.

The comment is no longer true since for ICU we can do that (it is not an
issue though). At the very least this comment needs to be updated.

Andreas


#76Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Andreas Karlsson (#75)
1 attachment(s)
Re: ICU integration

On 3/19/17 18:31, Andreas Karlsson wrote:

Ok, I applied the patch again and ran the tests. I get a test failure on
make check-world in the pg_dump tests, but it can be fixed with the change below.

Fixed. This was a new pg_dump test that had been added in the meantime.

I believe you are mistaken. The locale "sv" is just an alias for
"sv-u-standard" as far as I can tell. See the definition of the Swedish
locale
(http://source.icu-project.org/repos/icu/trunk/icu4c/source/data/coll/sv.txt)
and there are just three collations: reformed (default), standard,
search (plus eor and emoji, which are inherited).

I have filed bug <https://ssl.icu-project.org/trac/ticket/13050> about
this. I don't think we can work around this by simply ignoring the
"standard" variant. For example, the URL you show indicates that the
default variant for "sv" is "reformed", not "standard".

I am also quite annoyed at col-emoji and col-eor (European Ordering
Rules). They are defined at the root and inherited by all languages, but
no language modifies col-emoji for their needs which makes it a foot
gun. See the Danish sorting example below where at least I expected the
same order. For col-eor it makes a bit more sense since I would expect
the locales en_GB-u-col-eor and sv_SE-u-col-eor to collate exactly the same.

I think this is just the way the inheritance of locales works there. If
the root has a variant and the more specific locale doesn't, then if you
specify the variant, it gets pulled from the root. It is a bit dubious.
Arguably, either the customizations of the specific locale should be
applied to all variants, or ucol_getKeywordValuesForLocale() shouldn't
return them, or both. Perhaps another bug report.

- I get an error when creating a new locale.

#CREATE COLLATION sv2 (LOCALE = 'sv');
ERROR: could not create locale "sv": Success

# CREATE COLLATION sv2 (LOCALE = 'sv');
ERROR: could not create locale "sv": Resource temporarily unavailable
Time: 1.109 ms

Hmm, that's pretty straightforward code. What is your operating system?
What are the build options? Does it work without this patch?

This issue is unrelated to ICU. I had forgotten to specify the provider, so the
errors are correct (even though the value of the errno is weird).

Perhaps we can do something about this anyway. What is your OS and version?

- I do not know what the policy for formatting the documentation is, but
some of the paragraphs are in need of re-wrapping.

Hmm, I don't see anything terribly bad.

Maybe it is just me who is sensitive to wrapping, but here is an example
which sticks out to me with its 98-character line.

I see. Fixed some of those.

- The patch adds "FIXME" in the below. Is that a left-over from
development or something which still needs doing?

/*
* Also forbid matching an any-encoding entry.  This test of course
is not
* backed up by the unique index, but it's not a problem since we don't
-    * support adding any-encoding entries after initdb.
+    * support adding any-encoding entries after initdb. FIXME
*/

I had mentioned that upthread. It technically needs "doing" as you say,
but it's not clear how and it's not terribly important, arguably.

The comment is no longer true since for ICU we can do that (it is not an
issue though). At the very least this comment needs to be updated.

Fixed that by some additional checks and using ShareRowExclusiveLock in
CREATE COLLATION to prevent concurrent changes.

New patch attached.

I have also implemented additional facilities to check collations
needing refresh after an upgrade (see ALTER COLLATION ref page), and
some necessary bits for that in pg_dump. Some additional support in
pg_upgrade could be useful, but I think we don't need that until people
are upgrading *from* PG10.

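For reference, a rough sketch of how the version tracking could be used after
an ICU upgrade (ALTER COLLATION ... REFRESH VERSION is from the patch; the
pg_collation_actual_version() function name and the 'i' provider code are
assumptions here):

-- find ICU collations whose recorded version no longer matches the library
SELECT collname, collversion, pg_collation_actual_version(oid) AS current_version
FROM pg_collation
WHERE collprovider = 'i'
  AND collversion IS DISTINCT FROM pg_collation_actual_version(oid);

-- after reindexing affected indexes, record the new version
ALTER COLLATION "en-x-icu" REFRESH VERSION;
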
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v7-0001-ICU-support.patch (application/x-patch)
From b66c058aaf29149ac11e0025dd78e337d6415ca5 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter_e@gmx.net>
Date: Thu, 16 Feb 2017 00:06:16 -0500
Subject: [PATCH v7] ICU support

Add a column collprovider to pg_collation that determines which library
provides the collation data.  The existing choices are default and libc,
and this adds an icu choice, which uses the ICU4C library.

The pg_locale_t type is changed to a union that contains the
provider-specific locale handles.  Users of locale information are
changed to look into that struct for the appropriate handle to use.

Also add a collversion column that records the version of the collation
when it is created, and check at run time whether it is still the same.
This detects potentially incompatible library upgrades that can corrupt
indexes and other structures.  This is currently only supported by
ICU-provided collations.

initdb initializes the default collation set as before from the `locale
-a` output but also adds all available ICU locales with a "-x-icu"
appended.

Currently, ICU-provided collations can only be explicitly named
collations.  The global database locales are still always libc-provided.
---
 aclocal.m4                                         |   1 +
 config/pkg.m4                                      | 275 +++++++++++++
 configure                                          | 313 ++++++++++++++
 configure.in                                       |  35 ++
 doc/src/sgml/catalogs.sgml                         |  19 +
 doc/src/sgml/charset.sgml                          | 177 +++++++-
 doc/src/sgml/func.sgml                             |  17 +
 doc/src/sgml/installation.sgml                     |  14 +
 doc/src/sgml/mvcc.sgml                             |   3 +-
 doc/src/sgml/ref/alter_collation.sgml              |  55 +++
 doc/src/sgml/ref/create_collation.sgml             |  37 +-
 src/Makefile.global.in                             |   4 +
 src/backend/Makefile                               |   2 +-
 src/backend/catalog/pg_collation.c                 |  52 ++-
 src/backend/commands/collationcmds.c               | 288 ++++++++++++-
 src/backend/common.mk                              |   2 +
 src/backend/nodes/copyfuncs.c                      |  13 +
 src/backend/nodes/equalfuncs.c                     |  11 +
 src/backend/parser/gram.y                          |  18 +-
 src/backend/regex/regc_pg_locale.c                 | 110 +++--
 src/backend/tcop/utility.c                         |   8 +
 src/backend/utils/adt/formatting.c                 | 453 +++++++++++----------
 src/backend/utils/adt/like.c                       |  53 ++-
 src/backend/utils/adt/pg_locale.c                  | 266 ++++++++++--
 src/backend/utils/adt/selfuncs.c                   |   8 +-
 src/backend/utils/adt/varlena.c                    | 179 ++++++--
 src/backend/utils/mb/encnames.c                    |  76 ++++
 src/bin/initdb/initdb.c                            |   3 +-
 src/bin/pg_dump/pg_dump.c                          |  75 +++-
 src/bin/pg_dump/t/002_pg_dump.pl                   |   2 +-
 src/bin/psql/describe.c                            |   7 +-
 src/include/catalog/pg_collation.h                 |  25 +-
 src/include/catalog/pg_collation_fn.h              |   2 +
 src/include/catalog/pg_proc.h                      |   3 +
 src/include/commands/collationcmds.h               |   1 +
 src/include/mb/pg_wchar.h                          |   6 +
 src/include/nodes/nodes.h                          |   1 +
 src/include/nodes/parsenodes.h                     |  11 +
 src/include/pg_config.h.in                         |   6 +
 src/include/utils/pg_locale.h                      |  32 +-
 src/test/regress/GNUmakefile                       |   3 +
 .../{collate.linux.utf8.out => collate.icu.out}    | 204 ++++++----
 src/test/regress/expected/collate.linux.utf8.out   |  80 +++-
 .../{collate.linux.utf8.sql => collate.icu.sql}    | 105 +++--
 src/test/regress/sql/collate.linux.utf8.sql        |  32 ++
 45 files changed, 2563 insertions(+), 524 deletions(-)
 create mode 100644 config/pkg.m4
 copy src/test/regress/expected/{collate.linux.utf8.out => collate.icu.out} (80%)
 copy src/test/regress/sql/{collate.linux.utf8.sql => collate.icu.sql} (80%)

diff --git a/aclocal.m4 b/aclocal.m4
index 6f930b6fc1..5ca902b6a2 100644
--- a/aclocal.m4
+++ b/aclocal.m4
@@ -7,6 +7,7 @@ m4_include([config/docbook.m4])
 m4_include([config/general.m4])
 m4_include([config/libtool.m4])
 m4_include([config/perl.m4])
+m4_include([config/pkg.m4])
 m4_include([config/programs.m4])
 m4_include([config/python.m4])
 m4_include([config/tcl.m4])
diff --git a/config/pkg.m4 b/config/pkg.m4
new file mode 100644
index 0000000000..82bea96ee7
--- /dev/null
+++ b/config/pkg.m4
@@ -0,0 +1,275 @@
+dnl pkg.m4 - Macros to locate and utilise pkg-config.   -*- Autoconf -*-
+dnl serial 11 (pkg-config-0.29.1)
+dnl
+dnl Copyright © 2004 Scott James Remnant <scott@netsplit.com>.
+dnl Copyright © 2012-2015 Dan Nicholson <dbn.lists@gmail.com>
+dnl
+dnl This program is free software; you can redistribute it and/or modify
+dnl it under the terms of the GNU General Public License as published by
+dnl the Free Software Foundation; either version 2 of the License, or
+dnl (at your option) any later version.
+dnl
+dnl This program is distributed in the hope that it will be useful, but
+dnl WITHOUT ANY WARRANTY; without even the implied warranty of
+dnl MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+dnl General Public License for more details.
+dnl
+dnl You should have received a copy of the GNU General Public License
+dnl along with this program; if not, write to the Free Software
+dnl Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA
+dnl 02111-1307, USA.
+dnl
+dnl As a special exception to the GNU General Public License, if you
+dnl distribute this file as part of a program that contains a
+dnl configuration script generated by Autoconf, you may include it under
+dnl the same distribution terms that you use for the rest of that
+dnl program.
+
+dnl PKG_PREREQ(MIN-VERSION)
+dnl -----------------------
+dnl Since: 0.29
+dnl
+dnl Verify that the version of the pkg-config macros are at least
+dnl MIN-VERSION. Unlike PKG_PROG_PKG_CONFIG, which checks the user's
+dnl installed version of pkg-config, this checks the developer's version
+dnl of pkg.m4 when generating configure.
+dnl
+dnl To ensure that this macro is defined, also add:
+dnl m4_ifndef([PKG_PREREQ],
+dnl     [m4_fatal([must install pkg-config 0.29 or later before running autoconf/autogen])])
+dnl
+dnl See the "Since" comment for each macro you use to see what version
+dnl of the macros you require.
+m4_defun([PKG_PREREQ],
+[m4_define([PKG_MACROS_VERSION], [0.29.1])
+m4_if(m4_version_compare(PKG_MACROS_VERSION, [$1]), -1,
+    [m4_fatal([pkg.m4 version $1 or higher is required but ]PKG_MACROS_VERSION[ found])])
+])dnl PKG_PREREQ
+
+dnl PKG_PROG_PKG_CONFIG([MIN-VERSION])
+dnl ----------------------------------
+dnl Since: 0.16
+dnl
+dnl Search for the pkg-config tool and set the PKG_CONFIG variable to
+dnl first found in the path. Checks that the version of pkg-config found
+dnl is at least MIN-VERSION. If MIN-VERSION is not specified, 0.9.0 is
+dnl used since that's the first version where most current features of
+dnl pkg-config existed.
+AC_DEFUN([PKG_PROG_PKG_CONFIG],
+[m4_pattern_forbid([^_?PKG_[A-Z_]+$])
+m4_pattern_allow([^PKG_CONFIG(_(PATH|LIBDIR|SYSROOT_DIR|ALLOW_SYSTEM_(CFLAGS|LIBS)))?$])
+m4_pattern_allow([^PKG_CONFIG_(DISABLE_UNINSTALLED|TOP_BUILD_DIR|DEBUG_SPEW)$])
+AC_ARG_VAR([PKG_CONFIG], [path to pkg-config utility])
+AC_ARG_VAR([PKG_CONFIG_PATH], [directories to add to pkg-config's search path])
+AC_ARG_VAR([PKG_CONFIG_LIBDIR], [path overriding pkg-config's built-in search path])
+
+if test "x$ac_cv_env_PKG_CONFIG_set" != "xset"; then
+	AC_PATH_TOOL([PKG_CONFIG], [pkg-config])
+fi
+if test -n "$PKG_CONFIG"; then
+	_pkg_min_version=m4_default([$1], [0.9.0])
+	AC_MSG_CHECKING([pkg-config is at least version $_pkg_min_version])
+	if $PKG_CONFIG --atleast-pkgconfig-version $_pkg_min_version; then
+		AC_MSG_RESULT([yes])
+	else
+		AC_MSG_RESULT([no])
+		PKG_CONFIG=""
+	fi
+fi[]dnl
+])dnl PKG_PROG_PKG_CONFIG
+
+dnl PKG_CHECK_EXISTS(MODULES, [ACTION-IF-FOUND], [ACTION-IF-NOT-FOUND])
+dnl -------------------------------------------------------------------
+dnl Since: 0.18
+dnl
+dnl Check to see whether a particular set of modules exists. Similar to
+dnl PKG_CHECK_MODULES(), but does not set variables or print errors.
+dnl
+dnl Please remember that m4 expands AC_REQUIRE([PKG_PROG_PKG_CONFIG])
+dnl only at the first occurence in configure.ac, so if the first place
+dnl it's called might be skipped (such as if it is within an "if", you
+dnl have to call PKG_CHECK_EXISTS manually
+AC_DEFUN([PKG_CHECK_EXISTS],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])dnl
+if test -n "$PKG_CONFIG" && \
+    AC_RUN_LOG([$PKG_CONFIG --exists --print-errors "$1"]); then
+  m4_default([$2], [:])
+m4_ifvaln([$3], [else
+  $3])dnl
+fi])
+
+dnl _PKG_CONFIG([VARIABLE], [COMMAND], [MODULES])
+dnl ---------------------------------------------
+dnl Internal wrapper calling pkg-config via PKG_CONFIG and setting
+dnl pkg_failed based on the result.
+m4_define([_PKG_CONFIG],
+[if test -n "$$1"; then
+    pkg_cv_[]$1="$$1"
+ elif test -n "$PKG_CONFIG"; then
+    PKG_CHECK_EXISTS([$3],
+                     [pkg_cv_[]$1=`$PKG_CONFIG --[]$2 "$3" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes ],
+		     [pkg_failed=yes])
+ else
+    pkg_failed=untried
+fi[]dnl
+])dnl _PKG_CONFIG
+
+dnl _PKG_SHORT_ERRORS_SUPPORTED
+dnl ---------------------------
+dnl Internal check to see if pkg-config supports short errors.
+AC_DEFUN([_PKG_SHORT_ERRORS_SUPPORTED],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])
+if $PKG_CONFIG --atleast-pkgconfig-version 0.20; then
+        _pkg_short_errors_supported=yes
+else
+        _pkg_short_errors_supported=no
+fi[]dnl
+])dnl _PKG_SHORT_ERRORS_SUPPORTED
+
+
+dnl PKG_CHECK_MODULES(VARIABLE-PREFIX, MODULES, [ACTION-IF-FOUND],
+dnl   [ACTION-IF-NOT-FOUND])
+dnl --------------------------------------------------------------
+dnl Since: 0.4.0
+dnl
+dnl Note that if there is a possibility the first call to
+dnl PKG_CHECK_MODULES might not happen, you should be sure to include an
+dnl explicit call to PKG_PROG_PKG_CONFIG in your configure.ac
+AC_DEFUN([PKG_CHECK_MODULES],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])dnl
+AC_ARG_VAR([$1][_CFLAGS], [C compiler flags for $1, overriding pkg-config])dnl
+AC_ARG_VAR([$1][_LIBS], [linker flags for $1, overriding pkg-config])dnl
+
+pkg_failed=no
+AC_MSG_CHECKING([for $1])
+
+_PKG_CONFIG([$1][_CFLAGS], [cflags], [$2])
+_PKG_CONFIG([$1][_LIBS], [libs], [$2])
+
+m4_define([_PKG_TEXT], [Alternatively, you may set the environment variables $1[]_CFLAGS
+and $1[]_LIBS to avoid the need to call pkg-config.
+See the pkg-config man page for more details.])
+
+if test $pkg_failed = yes; then
+   	AC_MSG_RESULT([no])
+        _PKG_SHORT_ERRORS_SUPPORTED
+        if test $_pkg_short_errors_supported = yes; then
+	        $1[]_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "$2" 2>&1`
+        else 
+	        $1[]_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "$2" 2>&1`
+        fi
+	# Put the nasty error message in config.log where it belongs
+	echo "$$1[]_PKG_ERRORS" >&AS_MESSAGE_LOG_FD
+
+	m4_default([$4], [AC_MSG_ERROR(
+[Package requirements ($2) were not met:
+
+$$1_PKG_ERRORS
+
+Consider adjusting the PKG_CONFIG_PATH environment variable if you
+installed software in a non-standard prefix.
+
+_PKG_TEXT])[]dnl
+        ])
+elif test $pkg_failed = untried; then
+     	AC_MSG_RESULT([no])
+	m4_default([$4], [AC_MSG_FAILURE(
+[The pkg-config script could not be found or is too old.  Make sure it
+is in your PATH or set the PKG_CONFIG environment variable to the full
+path to pkg-config.
+
+_PKG_TEXT
+
+To get pkg-config, see <http://pkg-config.freedesktop.org/>.])[]dnl
+        ])
+else
+	$1[]_CFLAGS=$pkg_cv_[]$1[]_CFLAGS
+	$1[]_LIBS=$pkg_cv_[]$1[]_LIBS
+        AC_MSG_RESULT([yes])
+	$3
+fi[]dnl
+])dnl PKG_CHECK_MODULES
+
+
+dnl PKG_CHECK_MODULES_STATIC(VARIABLE-PREFIX, MODULES, [ACTION-IF-FOUND],
+dnl   [ACTION-IF-NOT-FOUND])
+dnl ---------------------------------------------------------------------
+dnl Since: 0.29
+dnl
+dnl Checks for existence of MODULES and gathers its build flags with
+dnl static libraries enabled. Sets VARIABLE-PREFIX_CFLAGS from --cflags
+dnl and VARIABLE-PREFIX_LIBS from --libs.
+dnl
+dnl Note that if there is a possibility the first call to
+dnl PKG_CHECK_MODULES_STATIC might not happen, you should be sure to
+dnl include an explicit call to PKG_PROG_PKG_CONFIG in your
+dnl configure.ac.
+AC_DEFUN([PKG_CHECK_MODULES_STATIC],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])dnl
+_save_PKG_CONFIG=$PKG_CONFIG
+PKG_CONFIG="$PKG_CONFIG --static"
+PKG_CHECK_MODULES($@)
+PKG_CONFIG=$_save_PKG_CONFIG[]dnl
+])dnl PKG_CHECK_MODULES_STATIC
+
+
+dnl PKG_INSTALLDIR([DIRECTORY])
+dnl -------------------------
+dnl Since: 0.27
+dnl
+dnl Substitutes the variable pkgconfigdir as the location where a module
+dnl should install pkg-config .pc files. By default the directory is
+dnl $libdir/pkgconfig, but the default can be changed by passing
+dnl DIRECTORY. The user can override through the --with-pkgconfigdir
+dnl parameter.
+AC_DEFUN([PKG_INSTALLDIR],
+[m4_pushdef([pkg_default], [m4_default([$1], ['${libdir}/pkgconfig'])])
+m4_pushdef([pkg_description],
+    [pkg-config installation directory @<:@]pkg_default[@:>@])
+AC_ARG_WITH([pkgconfigdir],
+    [AS_HELP_STRING([--with-pkgconfigdir], pkg_description)],,
+    [with_pkgconfigdir=]pkg_default)
+AC_SUBST([pkgconfigdir], [$with_pkgconfigdir])
+m4_popdef([pkg_default])
+m4_popdef([pkg_description])
+])dnl PKG_INSTALLDIR
+
+
+dnl PKG_NOARCH_INSTALLDIR([DIRECTORY])
+dnl --------------------------------
+dnl Since: 0.27
+dnl
+dnl Substitutes the variable noarch_pkgconfigdir as the location where a
+dnl module should install arch-independent pkg-config .pc files. By
+dnl default the directory is $datadir/pkgconfig, but the default can be
+dnl changed by passing DIRECTORY. The user can override through the
+dnl --with-noarch-pkgconfigdir parameter.
+AC_DEFUN([PKG_NOARCH_INSTALLDIR],
+[m4_pushdef([pkg_default], [m4_default([$1], ['${datadir}/pkgconfig'])])
+m4_pushdef([pkg_description],
+    [pkg-config arch-independent installation directory @<:@]pkg_default[@:>@])
+AC_ARG_WITH([noarch-pkgconfigdir],
+    [AS_HELP_STRING([--with-noarch-pkgconfigdir], pkg_description)],,
+    [with_noarch_pkgconfigdir=]pkg_default)
+AC_SUBST([noarch_pkgconfigdir], [$with_noarch_pkgconfigdir])
+m4_popdef([pkg_default])
+m4_popdef([pkg_description])
+])dnl PKG_NOARCH_INSTALLDIR
+
+
+dnl PKG_CHECK_VAR(VARIABLE, MODULE, CONFIG-VARIABLE,
+dnl [ACTION-IF-FOUND], [ACTION-IF-NOT-FOUND])
+dnl -------------------------------------------
+dnl Since: 0.28
+dnl
+dnl Retrieves the value of the pkg-config variable for the given module.
+AC_DEFUN([PKG_CHECK_VAR],
+[AC_REQUIRE([PKG_PROG_PKG_CONFIG])dnl
+AC_ARG_VAR([$1], [value of $3 for $2, overriding pkg-config])dnl
+
+_PKG_CONFIG([$1], [variable="][$3]["], [$2])
+AS_VAR_COPY([$1], [pkg_cv_][$1])
+
+AS_VAR_IF([$1], [""], [$5], [$4])dnl
+])dnl PKG_CHECK_VAR
diff --git a/configure b/configure
index b5cdebb510..481d5d15e8 100755
--- a/configure
+++ b/configure
@@ -715,6 +715,12 @@ krb_srvtab
 with_python
 with_perl
 with_tcl
+ICU_LIBS
+ICU_CFLAGS
+PKG_CONFIG_LIBDIR
+PKG_CONFIG_PATH
+PKG_CONFIG
+with_icu
 enable_thread_safety
 INCLUDES
 autodepend
@@ -821,6 +827,7 @@ with_CC
 enable_depend
 enable_cassert
 enable_thread_safety
+with_icu
 with_tcl
 with_tclconfig
 with_perl
@@ -856,6 +863,11 @@ LDFLAGS
 LIBS
 CPPFLAGS
 CPP
+PKG_CONFIG
+PKG_CONFIG_PATH
+PKG_CONFIG_LIBDIR
+ICU_CFLAGS
+ICU_LIBS
 LDFLAGS_EX
 LDFLAGS_SL
 DOCBOOKSTYLE'
@@ -1511,6 +1523,7 @@ Optional Packages:
   --with-wal-segsize=SEGSIZE
                           set WAL segment size in MB [16]
   --with-CC=CMD           set compiler (deprecated)
+  --with-icu              build with ICU support
   --with-tcl              build Tcl modules (PL/Tcl)
   --with-tclconfig=DIR    tclConfig.sh is in DIR
   --with-perl             build Perl modules (PL/Perl)
@@ -1546,6 +1559,13 @@ Some influential environment variables:
   CPPFLAGS    (Objective) C/C++ preprocessor flags, e.g. -I<include dir> if
               you have headers in a nonstandard directory <include dir>
   CPP         C preprocessor
+  PKG_CONFIG  path to pkg-config utility
+  PKG_CONFIG_PATH
+              directories to add to pkg-config's search path
+  PKG_CONFIG_LIBDIR
+              path overriding pkg-config's built-in search path
+  ICU_CFLAGS  C compiler flags for ICU, overriding pkg-config
+  ICU_LIBS    linker flags for ICU, overriding pkg-config
   LDFLAGS_EX  extra linker flags for linking executables only
   LDFLAGS_SL  extra linker flags for linking shared libraries only
   DOCBOOKSTYLE
@@ -5362,6 +5382,255 @@ $as_echo "$enable_thread_safety" >&6; }
 
 
 #
+# ICU
+#
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with ICU support" >&5
+$as_echo_n "checking whether to build with ICU support... " >&6; }
+
+
+
+# Check whether --with-icu was given.
+if test "${with_icu+set}" = set; then :
+  withval=$with_icu;
+  case $withval in
+    yes)
+
+$as_echo "#define USE_ICU 1" >>confdefs.h
+
+      ;;
+    no)
+      :
+      ;;
+    *)
+      as_fn_error $? "no argument expected for --with-icu option" "$LINENO" 5
+      ;;
+  esac
+
+else
+  with_icu=no
+
+fi
+
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $with_icu" >&5
+$as_echo "$with_icu" >&6; }
+
+
+if test "$with_icu" = yes; then
+
+
+
+
+
+
+
+if test "x$ac_cv_env_PKG_CONFIG_set" != "xset"; then
+	if test -n "$ac_tool_prefix"; then
+  # Extract the first word of "${ac_tool_prefix}pkg-config", so it can be a program name with args.
+set dummy ${ac_tool_prefix}pkg-config; ac_word=$2
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
+$as_echo_n "checking for $ac_word... " >&6; }
+if ${ac_cv_path_PKG_CONFIG+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  case $PKG_CONFIG in
+  [\\/]* | ?:[\\/]*)
+  ac_cv_path_PKG_CONFIG="$PKG_CONFIG" # Let the user override the test with a path.
+  ;;
+  *)
+  as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
+for as_dir in $PATH
+do
+  IFS=$as_save_IFS
+  test -z "$as_dir" && as_dir=.
+    for ac_exec_ext in '' $ac_executable_extensions; do
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
+    ac_cv_path_PKG_CONFIG="$as_dir/$ac_word$ac_exec_ext"
+    $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
+    break 2
+  fi
+done
+  done
+IFS=$as_save_IFS
+
+  ;;
+esac
+fi
+PKG_CONFIG=$ac_cv_path_PKG_CONFIG
+if test -n "$PKG_CONFIG"; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $PKG_CONFIG" >&5
+$as_echo "$PKG_CONFIG" >&6; }
+else
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+fi
+
+
+fi
+if test -z "$ac_cv_path_PKG_CONFIG"; then
+  ac_pt_PKG_CONFIG=$PKG_CONFIG
+  # Extract the first word of "pkg-config", so it can be a program name with args.
+set dummy pkg-config; ac_word=$2
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
+$as_echo_n "checking for $ac_word... " >&6; }
+if ${ac_cv_path_ac_pt_PKG_CONFIG+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  case $ac_pt_PKG_CONFIG in
+  [\\/]* | ?:[\\/]*)
+  ac_cv_path_ac_pt_PKG_CONFIG="$ac_pt_PKG_CONFIG" # Let the user override the test with a path.
+  ;;
+  *)
+  as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
+for as_dir in $PATH
+do
+  IFS=$as_save_IFS
+  test -z "$as_dir" && as_dir=.
+    for ac_exec_ext in '' $ac_executable_extensions; do
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
+    ac_cv_path_ac_pt_PKG_CONFIG="$as_dir/$ac_word$ac_exec_ext"
+    $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
+    break 2
+  fi
+done
+  done
+IFS=$as_save_IFS
+
+  ;;
+esac
+fi
+ac_pt_PKG_CONFIG=$ac_cv_path_ac_pt_PKG_CONFIG
+if test -n "$ac_pt_PKG_CONFIG"; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_pt_PKG_CONFIG" >&5
+$as_echo "$ac_pt_PKG_CONFIG" >&6; }
+else
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+fi
+
+  if test "x$ac_pt_PKG_CONFIG" = x; then
+    PKG_CONFIG=""
+  else
+    case $cross_compiling:$ac_tool_warned in
+yes:)
+{ $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: using cross tools not prefixed with host triplet" >&5
+$as_echo "$as_me: WARNING: using cross tools not prefixed with host triplet" >&2;}
+ac_tool_warned=yes ;;
+esac
+    PKG_CONFIG=$ac_pt_PKG_CONFIG
+  fi
+else
+  PKG_CONFIG="$ac_cv_path_PKG_CONFIG"
+fi
+
+fi
+if test -n "$PKG_CONFIG"; then
+	_pkg_min_version=0.9.0
+	{ $as_echo "$as_me:${as_lineno-$LINENO}: checking pkg-config is at least version $_pkg_min_version" >&5
+$as_echo_n "checking pkg-config is at least version $_pkg_min_version... " >&6; }
+	if $PKG_CONFIG --atleast-pkgconfig-version $_pkg_min_version; then
+		{ $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5
+$as_echo "yes" >&6; }
+	else
+		{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+		PKG_CONFIG=""
+	fi
+fi
+
+pkg_failed=no
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for ICU" >&5
+$as_echo_n "checking for ICU... " >&6; }
+
+if test -n "$ICU_CFLAGS"; then
+    pkg_cv_ICU_CFLAGS="$ICU_CFLAGS"
+ elif test -n "$PKG_CONFIG"; then
+    if test -n "$PKG_CONFIG" && \
+    { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"icu-uc icu-i18n\""; } >&5
+  ($PKG_CONFIG --exists --print-errors "icu-uc icu-i18n") 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; then
+  pkg_cv_ICU_CFLAGS=`$PKG_CONFIG --cflags "icu-uc icu-i18n" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes
+else
+  pkg_failed=yes
+fi
+ else
+    pkg_failed=untried
+fi
+if test -n "$ICU_LIBS"; then
+    pkg_cv_ICU_LIBS="$ICU_LIBS"
+ elif test -n "$PKG_CONFIG"; then
+    if test -n "$PKG_CONFIG" && \
+    { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"icu-uc icu-i18n\""; } >&5
+  ($PKG_CONFIG --exists --print-errors "icu-uc icu-i18n") 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; then
+  pkg_cv_ICU_LIBS=`$PKG_CONFIG --libs "icu-uc icu-i18n" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes
+else
+  pkg_failed=yes
+fi
+ else
+    pkg_failed=untried
+fi
+
+
+
+if test $pkg_failed = yes; then
+   	{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+
+if $PKG_CONFIG --atleast-pkgconfig-version 0.20; then
+        _pkg_short_errors_supported=yes
+else
+        _pkg_short_errors_supported=no
+fi
+        if test $_pkg_short_errors_supported = yes; then
+	        ICU_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "icu-uc icu-i18n" 2>&1`
+        else
+	        ICU_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "icu-uc icu-i18n" 2>&1`
+        fi
+	# Put the nasty error message in config.log where it belongs
+	echo "$ICU_PKG_ERRORS" >&5
+
+	as_fn_error $? "Package requirements (icu-uc icu-i18n) were not met:
+
+$ICU_PKG_ERRORS
+
+Consider adjusting the PKG_CONFIG_PATH environment variable if you
+installed software in a non-standard prefix.
+
+Alternatively, you may set the environment variables ICU_CFLAGS
+and ICU_LIBS to avoid the need to call pkg-config.
+See the pkg-config man page for more details." "$LINENO" 5
+elif test $pkg_failed = untried; then
+     	{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+	{ { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5
+$as_echo "$as_me: error: in \`$ac_pwd':" >&2;}
+as_fn_error $? "The pkg-config script could not be found or is too old.  Make sure it
+is in your PATH or set the PKG_CONFIG environment variable to the full
+path to pkg-config.
+
+Alternatively, you may set the environment variables ICU_CFLAGS
+and ICU_LIBS to avoid the need to call pkg-config.
+See the pkg-config man page for more details.
+
+To get pkg-config, see <http://pkg-config.freedesktop.org/>.
+See \`config.log' for more details" "$LINENO" 5; }
+else
+	ICU_CFLAGS=$pkg_cv_ICU_CFLAGS
+	ICU_LIBS=$pkg_cv_ICU_LIBS
+        { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5
+$as_echo "yes" >&6; }
+
+fi
+fi
+
+#
 # Optionally build Tcl modules (PL/Tcl)
 #
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with Tcl" >&5
@@ -13433,6 +13702,50 @@ fi
 done
 
 
+if test "$with_icu" = yes; then
+  # ICU functions are macros, so we need to do this the long way.
+
+  # ucol_strcollUTF8() appeared in ICU 50.
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for ucol_strcollUTF8" >&5
+$as_echo_n "checking for ucol_strcollUTF8... " >&6; }
+if ${pgac_cv_func_ucol_strcollUTF8+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  ac_save_CPPFLAGS=$CPPFLAGS
+CPPFLAGS="$ICU_CFLAGS $CPPFLAGS"
+ac_save_LIBS=$LIBS
+LIBS="$ICU_LIBS $LIBS"
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+#include <unicode/ucol.h>
+
+int
+main ()
+{
+ucol_strcollUTF8(NULL, NULL, 0, NULL, 0, NULL);
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+  pgac_cv_func_ucol_strcollUTF8=yes
+else
+  pgac_cv_func_ucol_strcollUTF8=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+    conftest$ac_exeext conftest.$ac_ext
+CPPFLAGS=$ac_save_CPPFLAGS
+LIBS=$ac_save_LIBS
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_func_ucol_strcollUTF8" >&5
+$as_echo "$pgac_cv_func_ucol_strcollUTF8" >&6; }
+  if test "$pgac_cv_func_ucol_strcollUTF8" = yes ; then
+
+$as_echo "#define HAVE_UCOL_STRCOLLUTF8 1" >>confdefs.h
+
+  fi
+fi
+
 # Lastly, restore full LIBS list and check for readline/libedit symbols
 LIBS="$LIBS_including_readline"
 
diff --git a/configure.in b/configure.in
index 1d99cda1d8..235b8745d6 100644
--- a/configure.in
+++ b/configure.in
@@ -614,6 +614,19 @@ AC_MSG_RESULT([$enable_thread_safety])
 AC_SUBST(enable_thread_safety)
 
 #
+# ICU
+#
+AC_MSG_CHECKING([whether to build with ICU support])
+PGAC_ARG_BOOL(with, icu, no, [build with ICU support],
+              [AC_DEFINE([USE_ICU], 1, [Define to build with ICU support. (--with-icu)])])
+AC_MSG_RESULT([$with_icu])
+AC_SUBST(with_icu)
+
+if test "$with_icu" = yes; then
+  PKG_CHECK_MODULES(ICU, icu-uc icu-i18n)
+fi
+
+#
 # Optionally build Tcl modules (PL/Tcl)
 #
 AC_MSG_CHECKING([whether to build with Tcl])
@@ -1634,6 +1647,28 @@ fi
 AC_CHECK_FUNCS([strtoll strtoq], [break])
 AC_CHECK_FUNCS([strtoull strtouq], [break])
 
+if test "$with_icu" = yes; then
+  # ICU functions are macros, so we need to do this the long way.
+
+  # ucol_strcollUTF8() appeared in ICU 50.
+  AC_CACHE_CHECK([for ucol_strcollUTF8], [pgac_cv_func_ucol_strcollUTF8],
+[ac_save_CPPFLAGS=$CPPFLAGS
+CPPFLAGS="$ICU_CFLAGS $CPPFLAGS"
+ac_save_LIBS=$LIBS
+LIBS="$ICU_LIBS $LIBS"
+AC_LINK_IFELSE([AC_LANG_PROGRAM(
+[#include <unicode/ucol.h>
+],
+[ucol_strcollUTF8(NULL, NULL, 0, NULL, 0, NULL);])],
+[pgac_cv_func_ucol_strcollUTF8=yes],
+[pgac_cv_func_ucol_strcollUTF8=no])
+CPPFLAGS=$ac_save_CPPFLAGS
+LIBS=$ac_save_LIBS])
+  if test "$pgac_cv_func_ucol_strcollUTF8" = yes ; then
+    AC_DEFINE([HAVE_UCOL_STRCOLLUTF8], 1, [Define to 1 if you have the `ucol_strcollUTF8' function.])
+  fi
+fi
+
 # Lastly, restore full LIBS list and check for readline/libedit symbols
 LIBS="$LIBS_including_readline"
 
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index df0435c3f0..8e87a18e9a 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -2016,6 +2016,14 @@ <title><structname>pg_collation</> Columns</title>
      </row>
 
      <row>
+      <entry><structfield>collprovider</structfield></entry>
+      <entry><type>char</type></entry>
+      <entry></entry>
+      <entry>Provider of the collation: <literal>d</literal> = database
+       default, <literal>c</literal> = libc, <literal>i</literal> = icu</entry>
+     </row>
+
+     <row>
       <entry><structfield>collencoding</structfield></entry>
       <entry><type>int4</type></entry>
       <entry></entry>
@@ -2036,6 +2044,17 @@ <title><structname>pg_collation</> Columns</title>
       <entry></entry>
       <entry><symbol>LC_CTYPE</> for this collation object</entry>
      </row>
+
+     <row>
+      <entry><structfield>collversion</structfield></entry>
+      <entry><type>text</type></entry>
+      <entry></entry>
+      <entry>
+       Provider-specific version of the collation.  This is recorded when the
+       collation is created and then checked when it is used, to detect
+       changes in the collation definition that could lead to data corruption.
+      </entry>
+     </row>
     </tbody>
    </tgroup>
   </table>
diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml
index 2aba0fc528..de2a798468 100644
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@@ -500,21 +500,47 @@ <title>Concepts</title>
    <title>Managing Collations</title>
 
    <para>
-    A collation is an SQL schema object that maps an SQL name to
-    operating system locales.  In particular, it maps to a combination
-    of <symbol>LC_COLLATE</symbol> and <symbol>LC_CTYPE</symbol>.  (As
+    A collation is an SQL schema object that maps an SQL name to locales
+    provided by libraries installed in the operating system.  A collation
+    definition has a <firstterm>provider</firstterm> that specifies which
+    library supplies the locale data.  One standard provider name
+    is <literal>libc</literal>, which uses the locales provided by the
+    operating system C library.  These are the locales that most tools
+    provided by the operating system use.  Another provider
+    is <literal>icu</literal>, which uses the external
+    ICU<indexterm><primary>ICU</></> library.  Support for ICU has to be
+    configured when PostgreSQL is built.
+   </para>
+
+   <para>
+    A collation object provided by <literal>libc</literal> maps to a
+    combination of <symbol>LC_COLLATE</symbol> and <symbol>LC_CTYPE</symbol>
+    settings.  (As
     the name would suggest, the main purpose of a collation is to set
     <symbol>LC_COLLATE</symbol>, which controls the sort order.  But
     it is rarely necessary in practice to have an
     <symbol>LC_CTYPE</symbol> setting that is different from
     <symbol>LC_COLLATE</symbol>, so it is more convenient to collect
     these under one concept than to create another infrastructure for
-    setting <symbol>LC_CTYPE</symbol> per expression.)  Also, a collation
+    setting <symbol>LC_CTYPE</symbol> per expression.)  Also,
+    a <literal>libc</literal> collation
     is tied to a character set encoding (see <xref linkend="multibyte">).
     The same collation name may exist for different encodings.
    </para>
 
    <para>
+    A collation provided by <literal>icu</literal> maps to a named collator
+    provided by the ICU library.  ICU does not support
+    separate <quote>collate</quote> and <quote>ctype</quote> settings, so they
+    are always the same.  Also, ICU collations are independent of the
+    encoding, so there is always only one ICU collation for a given name in a
+    database.
+   </para>
+
+   <sect3>
+    <title>Standard Collations</title>
+
+   <para>
     On all platforms, the collations named <literal>default</>,
     <literal>C</>, and <literal>POSIX</> are available.  Additional
     collations may be available depending on operating system support.
@@ -528,12 +554,36 @@ <title>Managing Collations</title>
    </para>
 
    <para>
+    Additionally, the SQL standard collation name <literal>ucs_basic</literal>
+    is available for encoding <literal>UTF8</literal>.  It is equivalent
+    to <literal>C</literal> and sorts by Unicode code point.
+   </para>
+  </sect3>
+
+  <sect3>
+   <title>Predefined Collations</title>
+
+   <para>
     If the operating system provides support for using multiple locales
     within a single program (<function>newlocale</> and related functions),
+    or support for ICU is configured,
     then when a database cluster is initialized, <command>initdb</command>
     populates the system catalog <literal>pg_collation</literal> with
     collations based on all the locales it finds on the operating
-    system at the time.  For example, the operating system might
+    system at the time.
+   </para>
+
+   <para>
+    To inspect the currently available locales, use the query <literal>SELECT
+    * FROM pg_collation</literal>, or the command <command>\dOS+</command>
+    in <application>psql</application>.
+   </para>
+
+  <sect4>
+   <title>libc Collations</title>
+
+   <para>
+    For example, the operating system might
     provide a locale named <literal>de_DE.utf8</literal>.
     <command>initdb</command> would then create a collation named
     <literal>de_DE.utf8</literal> for encoding <literal>UTF8</literal>
@@ -548,13 +598,14 @@ <title>Managing Collations</title>
    </para>
 
    <para>
-    In case a collation is needed that has different values for
-    <symbol>LC_COLLATE</symbol> and <symbol>LC_CTYPE</symbol>, a new
-    collation may be created using
-    the <xref linkend="sql-createcollation"> command.  That command
-    can also be used to create a new collation from an existing
-    collation, which can be useful to be able to use
-    operating-system-independent collation names in applications.
+    The default set of collations provided by <literal>libc</literal> map
+    directly to the locales installed in the operating system, which can be
+    listed using the command <literal>locale -a</literal>.  In case
+    a <literal>libc</literal> collation is needed that has different values
+    for <symbol>LC_COLLATE</symbol> and <symbol>LC_CTYPE</symbol>, or new
+    locales are installed in the operating system after the database system
+    was initialized, then a new collation may be created using
+    the <xref linkend="sql-createcollation"> command.
    </para>
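+
+   <para>
+    For example, a minimal sketch (assuming the operating system provides a
+    locale named <literal>de_DE.iso885915@euro</literal>):
+<programlisting>
+CREATE COLLATION german_euro (LC_COLLATE = 'de_DE.iso885915@euro',
+                              LC_CTYPE = 'de_DE.iso885915@euro');
+</programlisting>
+   </para>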
 
    <para>
@@ -566,8 +617,8 @@ <title>Managing Collations</title>
     Use of the stripped collation names is recommended, since it will
     make one less thing you need to change if you decide to change to
     another database encoding.  Note however that the <literal>default</>,
-    <literal>C</>, and <literal>POSIX</> collations can be used
-    regardless of the database encoding.
+    <literal>C</>, and <literal>POSIX</> collations, as well as all collations
+    provided by ICU, can be used regardless of the database encoding.
    </para>
 
    <para>
@@ -581,6 +632,104 @@ <title>Managing Collations</title>
     collations have identical behaviors.  Mixing stripped and non-stripped
     collation names is therefore not recommended.
    </para>
+  </sect4>
+
+  <sect4>
+   <title>ICU Collations</title>
+
+   <para>
+    Collations provided by ICU are created with names in BCP 47 language tag
+    format, with a <quote>private use</quote>
+    extension <literal>-x-icu</literal> appended, to distinguish them from
+    libc locales.  So <literal>de-x-icu</literal> would be an example.
+   </para>
+
+   <para>
+    With ICU, it is not sensible to enumerate all possible locale names.  ICU
+    uses a particular naming system for locales, but there are many more ways
+    to name a locale than there are actually distinct locales.  (In fact, any
+    string will be accepted as a locale name.)
+    See <ulink url="http://userguide.icu-project.org/locale"></ulink> for
+    information on ICU locale naming.  <command>initdb</command> uses the ICU
+    APIs to extract a set of locales with distinct collation rules to populate
+    the initial set of collations.  Here are some examples collations that
+    might be created:
+
+    <variablelist>
+     <varlistentry>
+      <term><literal>de-x-icu</literal></term>
+      <listitem>
+       <para>German collation, default variant</para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><literal>de-u-co-phonebk-x-icu</literal></term>
+      <listitem>
+       <para>German collation, phone book variant</para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><literal>de-AT-x-icu</literal></term>
+      <listitem>
+       <para>German collation for Austria, default variant</para>
+       <para>
+        (Note that as of this writing, there is no,
+        say, <literal>de-DE-x-icu</literal> or <literal>de-CH-x-icu</literal>,
+        because those are equivalent to <literal>de-x-icu</literal>.)
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><literal>de-AT-u-co-phonebk-x-icu</literal></term>
+      <listitem>
+       <para>German collation for Austria, phone book variant</para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
+      <term><literal>und-x-icu</literal> (for <quote>undefined</quote>)</term>
+      <listitem>
+       <para>
+        ICU <quote>root</quote> collation.  Use this to get a reasonable
+        language-agnostic sort order.
+       </para>
+      </listitem>
+     </varlistentry>
+    </variablelist>
+   </para>
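+
+   <para>
+    As an illustration (with a hypothetical table and column), an ICU
+    collation can be used like any other collation, for example in
+    a <literal>COLLATE</literal> clause:
+<programlisting>
+SELECT name FROM customers ORDER BY name COLLATE "de-u-co-phonebk-x-icu";
+</programlisting>
+   </para>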
+
+   <para>
+    Some (less frequently used) encodings are not supported by ICU.  If the
+    database cluster was initialized with such an encoding, no ICU collations
+    will be predefined.
+   </para>
+   </sect4>
+   </sect3>
+
+   <sect3>
+   <title>Copying Collations</title>
+
+   <para>
+    The command <xref linkend="sql-createcollation"> can also be used to
+    create a new collation from an existing collation.  This can be useful for
+    using operating-system-independent collation names in applications,
+    creating compatibility names, or using an ICU-provided collation under a
+    more readable name.  For example:
+<programlisting>
+CREATE COLLATION german FROM "de_DE";
+CREATE COLLATION french FROM "fr-x-icu";
+CREATE COLLATION "de-DE-x-icu" FROM "de-x-icu";
+</programlisting>
+   </para>
+
+   <para>
+    The standard and predefined collations are in the
+    schema <literal>pg_catalog</literal>, like all predefined objects.
+    User-defined collations should be created in user schemas.  This also
+    ensures that they are saved by <command>pg_dump</command>.
+   </para>
   </sect2>
  </sect1>
 
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 9408a255dc..b652539ece 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -19541,6 +19541,14 @@ <title>Collation Management Functions</title>
      <tbody>
       <row>
        <entry>
+        <indexterm><primary>pg_collation_actual_version</primary></indexterm>
+        <literal><function>pg_collation_actual_version(<type>oid</>)</function></literal>
+       </entry>
+       <entry><type>text</type></entry>
+       <entry>Return actual version of collation from operating system</entry>
+      </row>
+      <row>
+       <entry>
         <indexterm><primary>pg_import_system_collations</primary></indexterm>
         <literal><function>pg_import_system_collations(<parameter>if_not_exists</> <type>boolean</>, <parameter>schema</> <type>regnamespace</>)</function></literal>
        </entry>
@@ -19552,6 +19560,15 @@ <title>Collation Management Functions</title>
    </table>
 
    <para>
+    <function>pg_collation_actual_version</function> returns the actual
+    version of the collation object as it is currently installed in the
+    operating system.  If this is different from the value
+    in <literal>pg_collation.collversion</literal>, then objects depending on
+    the collation might need to be rebuilt.  See also
+    <xref linkend="sql-altercollation">.
+   </para>
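+
+   <para>
+    For example, an illustrative query to compare the recorded and actual
+    versions of a single collation (the collation name is just an example):
+<programlisting>
+SELECT collversion, pg_collation_actual_version(oid)
+  FROM pg_collation WHERE collname = 'de-x-icu';
+</programlisting>
+   </para>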
+
+   <para>
     <function>pg_import_system_collations</> populates the system
     catalog <literal>pg_collation</literal> with collations based on all the
     locales it finds on the operating system.  This is
diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
index 79201b78e3..bb3778f02e 100644
--- a/doc/src/sgml/installation.sgml
+++ b/doc/src/sgml/installation.sgml
@@ -767,6 +767,20 @@ <title>Configuration</title>
       </varlistentry>
 
       <varlistentry>
+       <term><option>--with-icu</option></term>
+       <listitem>
+        <para>
+         Build with support for
+         the <productname>ICU</productname><indexterm><primary>ICU</></>
+         library.  This requires the <productname>ICU4C</productname> package
+         as well
+         as <productname>pkg-config</productname><indexterm><primary>pkg-config</></>
+         to be installed.
+        </para>
+       </listitem>
+      </varlistentry>
+
+      <varlistentry>
        <term><option>--with-openssl</option>
        <indexterm>
         <primary>OpenSSL</primary>
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index 306def4a15..82e69fe2d2 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -967,7 +967,8 @@ <title>Table-level Lock Modes</title>
         </para>
 
         <para>
-         Acquired by <command>CREATE TRIGGER</command> and many forms of
+         Acquired by <command>CREATE COLLATION</command>,
+         <command>CREATE TRIGGER</command>, and many forms of
          <command>ALTER TABLE</command> (see <xref linkend="SQL-ALTERTABLE">).
         </para>
        </listitem>
diff --git a/doc/src/sgml/ref/alter_collation.sgml b/doc/src/sgml/ref/alter_collation.sgml
index 6708c7e10e..bf934ce75f 100644
--- a/doc/src/sgml/ref/alter_collation.sgml
+++ b/doc/src/sgml/ref/alter_collation.sgml
@@ -21,6 +21,8 @@
 
  <refsynopsisdiv>
 <synopsis>
+ALTER COLLATION <replaceable>name</replaceable> REFRESH VERSION
+
 ALTER COLLATION <replaceable>name</replaceable> RENAME TO <replaceable>new_name</replaceable>
 ALTER COLLATION <replaceable>name</replaceable> OWNER TO { <replaceable>new_owner</replaceable> | CURRENT_USER | SESSION_USER }
 ALTER COLLATION <replaceable>name</replaceable> SET SCHEMA <replaceable>new_schema</replaceable>
@@ -85,9 +87,62 @@ <title>Parameters</title>
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term><literal>REFRESH VERSION</literal></term>
+    <listitem>
+     <para>
+      Update the collation version.
+      See <xref linkend="sql-altercollation-notes"> below.
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
  </refsect1>
 
+ <refsect1 id="sql-altercollation-notes">
+  <title>Notes</title>
+
+  <para>
+   When using collations provided by the ICU library, the ICU-specific version
+   of the collator is recorded in the system catalog when the collation object
+   is created.  When the collation is then used, the current version is
+   checked against the recorded version, and a warning is issued when there is
+   a mismatch, for example:
+<screen>
+WARNING:  ICU collator version mismatch
+DETAIL:  The database was created using version 1.2.3.4, the library provides version 2.3.4.5.
+HINT:  Rebuild all objects affected by this collation and run ALTER COLLATION pg_catalog."xx-x-icu" REFRESH VERSION, or build PostgreSQL with the right version of ICU.
+</screen>
+   A change in collation definitions can lead to corrupt indexes and other
+   problems where the database system relies on stored objects having a
+   certain sort order.  Generally, this should be avoided, but it can happen
+   in legitimate circumstances, such as when
+   using <command>pg_upgrade</command> to upgrade to server binaries linked
+   with a newer version of ICU.  When this happens, all objects depending on
+   the collation should be rebuilt, for example,
+   using <command>REINDEX</command>.  When that is done, the collation version
+   can be refreshed using the command <literal>ALTER COLLATION ... REFRESH
+   VERSION</literal>.  This will update the system catalog to record the
+   current collator version and will make the warning go away.  Note that this
+   does not actually check whether all affected objects have been rebuilt
+   correctly.
+  </para>
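+
+  <para>
+   For example (with a hypothetical index name and collation):
+<programlisting>
+REINDEX INDEX my_index_using_xx_icu;
+ALTER COLLATION "xx-x-icu" REFRESH VERSION;
+</programlisting>
+  </para>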
+
+  <para>
+   The following query can be used to identify all collations in the current
+   database that need to be refreshed and the objects that depend on them:
+<programlisting><![CDATA[
+SELECT pg_describe_object(refclassid, refobjid, refobjsubid) AS "Collation",
+       pg_describe_object(classid, objid, objsubid) AS "Object"
+  FROM pg_depend d JOIN pg_collation c
+       ON refclassid = 'pg_collation'::regclass AND refobjid = c.oid
+  WHERE c.collversion <> pg_collation_actual_version(c.oid)
+  ORDER BY 1, 2;
+]]></programlisting>
+  </para>
+ </refsect1>
+
  <refsect1>
   <title>Examples</title>
 
diff --git a/doc/src/sgml/ref/create_collation.sgml b/doc/src/sgml/ref/create_collation.sgml
index c09e5bd6d4..47de9a09b6 100644
--- a/doc/src/sgml/ref/create_collation.sgml
+++ b/doc/src/sgml/ref/create_collation.sgml
@@ -21,7 +21,9 @@
 CREATE COLLATION [ IF NOT EXISTS ] <replaceable>name</replaceable> (
     [ LOCALE = <replaceable>locale</replaceable>, ]
     [ LC_COLLATE = <replaceable>lc_collate</replaceable>, ]
-    [ LC_CTYPE = <replaceable>lc_ctype</replaceable> ]
+    [ LC_CTYPE = <replaceable>lc_ctype</replaceable>, ]
+    [ PROVIDER = <replaceable>provider</replaceable>, ]
+    [ VERSION = <replaceable>version</replaceable> ]
 )
 CREATE COLLATION [ IF NOT EXISTS ] <replaceable>name</replaceable> FROM <replaceable>existing_collation</replaceable>
 </synopsis>
@@ -114,6 +116,39 @@ <title>Parameters</title>
     </varlistentry>
 
     <varlistentry>
+     <term><replaceable>provider</replaceable></term>
+
+     <listitem>
+      <para>
+       Specifies the provider to use for locale services associated with this
+       collation.  Possible values
+       are: <literal>icu</literal>,<indexterm><primary>ICU</></> <literal>libc</literal>.
+       The available choices depend on the operating system and build options.
+      </para>
+     </listitem>
+    </varlistentry>
+
+    <varlistentry>
+     <term><replaceable>version</replaceable></term>
+
+     <listitem>
+      <para>
+       Specifies the version string to store with the collation.  Normally,
+       this should be omitted, which will cause the version to be computed
+       from the actual version of the collation as provided by the operating
+       system.  This option is intended to be used
+       by <command>pg_upgrade</command> for copying the version from an
+       existing installation.
+      </para>
+
+      <para>
+       See also <xref linkend="sql-altercollation"> for how to handle
+       collation version mismatches.
+      </para>
+     </listitem>
+    </varlistentry>
+
+    <varlistentry>
      <term><replaceable>existing_collation</replaceable></term>
 
      <listitem>
diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index 8e1d6e3bd4..4acf7d2f06 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -179,6 +179,7 @@ pgxsdir = $(pkglibdir)/pgxs
 #
 # Records the choice of the various --enable-xxx and --with-xxx options.
 
+with_icu	= @with_icu@
 with_perl	= @with_perl@
 with_python	= @with_python@
 with_tcl	= @with_tcl@
@@ -208,6 +209,9 @@ python_version		= @python_version@
 
 krb_srvtab = @krb_srvtab@
 
+ICU_CFLAGS		= @ICU_CFLAGS@
+ICU_LIBS		= @ICU_LIBS@
+
 TCLSH			= @TCLSH@
 TCL_LIBS		= @TCL_LIBS@
 TCL_LIB_SPEC		= @TCL_LIB_SPEC@
diff --git a/src/backend/Makefile b/src/backend/Makefile
index 7a0bbb2942..fffb0d95ba 100644
--- a/src/backend/Makefile
+++ b/src/backend/Makefile
@@ -58,7 +58,7 @@ ifneq ($(PORTNAME), win32)
 ifneq ($(PORTNAME), aix)
 
 postgres: $(OBJS)
-	$(CC) $(CFLAGS) $(LDFLAGS) $(LDFLAGS_EX) $(export_dynamic) $(call expand_subsys,$^) $(LIBS) -o $@
+	$(CC) $(CFLAGS) $(LDFLAGS) $(LDFLAGS_EX) $(export_dynamic) $(call expand_subsys,$^) $(LIBS) $(ICU_LIBS) -o $@
 
 endif
 endif
diff --git a/src/backend/catalog/pg_collation.c b/src/backend/catalog/pg_collation.c
index 65b6051c0d..ede920955d 100644
--- a/src/backend/catalog/pg_collation.c
+++ b/src/backend/catalog/pg_collation.c
@@ -27,6 +27,7 @@
 #include "mb/pg_wchar.h"
 #include "utils/builtins.h"
 #include "utils/fmgroids.h"
+#include "utils/pg_locale.h"
 #include "utils/rel.h"
 #include "utils/syscache.h"
 #include "utils/tqual.h"
@@ -40,8 +41,10 @@
 Oid
 CollationCreate(const char *collname, Oid collnamespace,
 				Oid collowner,
+				char collprovider,
 				int32 collencoding,
 				const char *collcollate, const char *collctype,
+				const char *collversion,
 				bool if_not_exists)
 {
 	Relation	rel;
@@ -78,29 +81,47 @@ CollationCreate(const char *collname, Oid collnamespace,
 		{
 			ereport(NOTICE,
 				(errcode(ERRCODE_DUPLICATE_OBJECT),
-				 errmsg("collation \"%s\" for encoding \"%s\" already exists, skipping",
-						collname, pg_encoding_to_char(collencoding))));
+				 collencoding == -1
+				 ? errmsg("collation \"%s\" already exists, skipping",
+						  collname)
+				 : errmsg("collation \"%s\" for encoding \"%s\" already exists, skipping",
+						  collname, pg_encoding_to_char(collencoding))));
 			return InvalidOid;
 		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_DUPLICATE_OBJECT),
-					 errmsg("collation \"%s\" for encoding \"%s\" already exists",
-							collname, pg_encoding_to_char(collencoding))));
+					 collencoding == -1
+					 ? errmsg("collation \"%s\" already exists",
+							  collname)
+					 : errmsg("collation \"%s\" for encoding \"%s\" already exists",
+							  collname, pg_encoding_to_char(collencoding))));
 	}
 
+	/* open pg_collation; see below about the lock level */
+	rel = heap_open(CollationRelationId, ShareRowExclusiveLock);
+
 	/*
-	 * Also forbid matching an any-encoding entry.  This test of course is not
-	 * backed up by the unique index, but it's not a problem since we don't
-	 * support adding any-encoding entries after initdb.
+	 * Also forbid a specific-encoding collation shadowing an any-encoding
+	 * collation, or an any-encoding collation being shadowed (see
+	 * get_collation_name()).  This test is not backed up by the unique index,
+	 * so we take a ShareRowExclusiveLock earlier, to protect against
+	 * concurrent changes fooling this check.
 	 */
-	if (SearchSysCacheExists3(COLLNAMEENCNSP,
-							  PointerGetDatum(collname),
-							  Int32GetDatum(-1),
-							  ObjectIdGetDatum(collnamespace)))
+	if ((collencoding == -1 &&
+		 SearchSysCacheExists3(COLLNAMEENCNSP,
+							   PointerGetDatum(collname),
+							   Int32GetDatum(GetDatabaseEncoding()),
+							   ObjectIdGetDatum(collnamespace))) ||
+		(collencoding != -1 &&
+		 SearchSysCacheExists3(COLLNAMEENCNSP,
+							   PointerGetDatum(collname),
+							   Int32GetDatum(-1),
+							   ObjectIdGetDatum(collnamespace))))
 	{
 		if (if_not_exists)
 		{
+			heap_close(rel, NoLock);
 			ereport(NOTICE,
 				(errcode(ERRCODE_DUPLICATE_OBJECT),
 				 errmsg("collation \"%s\" already exists, skipping",
@@ -114,8 +135,6 @@ CollationCreate(const char *collname, Oid collnamespace,
 						collname)));
 	}
 
-	/* open pg_collation */
-	rel = heap_open(CollationRelationId, RowExclusiveLock);
 	tupDesc = RelationGetDescr(rel);
 
 	/* form a tuple */
@@ -125,11 +144,16 @@ CollationCreate(const char *collname, Oid collnamespace,
 	values[Anum_pg_collation_collname - 1] = NameGetDatum(&name_name);
 	values[Anum_pg_collation_collnamespace - 1] = ObjectIdGetDatum(collnamespace);
 	values[Anum_pg_collation_collowner - 1] = ObjectIdGetDatum(collowner);
+	values[Anum_pg_collation_collprovider - 1] = CharGetDatum(collprovider);
 	values[Anum_pg_collation_collencoding - 1] = Int32GetDatum(collencoding);
 	namestrcpy(&name_collate, collcollate);
 	values[Anum_pg_collation_collcollate - 1] = NameGetDatum(&name_collate);
 	namestrcpy(&name_ctype, collctype);
 	values[Anum_pg_collation_collctype - 1] = NameGetDatum(&name_ctype);
+	if (collversion)
+		values[Anum_pg_collation_collversion - 1] = CStringGetTextDatum(collversion);
+	else
+		nulls[Anum_pg_collation_collversion - 1] = true;
 
 	tup = heap_form_tuple(tupDesc, values, nulls);
 
@@ -159,7 +183,7 @@ CollationCreate(const char *collname, Oid collnamespace,
 	InvokeObjectPostCreateHook(CollationRelationId, oid, 0);
 
 	heap_freetuple(tup);
-	heap_close(rel, RowExclusiveLock);
+	heap_close(rel, NoLock);
 
 	return oid;
 }
diff --git a/src/backend/commands/collationcmds.c b/src/backend/commands/collationcmds.c
index 919cfc6a06..835cb263db 100644
--- a/src/backend/commands/collationcmds.c
+++ b/src/backend/commands/collationcmds.c
@@ -14,15 +14,18 @@
  */
 #include "postgres.h"
 
+#include "access/heapam.h"
 #include "access/htup_details.h"
 #include "access/xact.h"
 #include "catalog/dependency.h"
 #include "catalog/indexing.h"
 #include "catalog/namespace.h"
+#include "catalog/objectaccess.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_collation_fn.h"
 #include "commands/alter.h"
 #include "commands/collationcmds.h"
+#include "commands/comment.h"
 #include "commands/dbcommands.h"
 #include "commands/defrem.h"
 #include "mb/pg_wchar.h"
@@ -33,6 +36,7 @@
 #include "utils/rel.h"
 #include "utils/syscache.h"
 
+
 /*
  * CREATE COLLATION
  */
@@ -47,8 +51,14 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
 	DefElem    *localeEl = NULL;
 	DefElem    *lccollateEl = NULL;
 	DefElem    *lcctypeEl = NULL;
+	DefElem    *providerEl = NULL;
+	DefElem    *versionEl = NULL;
 	char	   *collcollate = NULL;
 	char	   *collctype = NULL;
+	char	   *collproviderstr = NULL;
+	int			collencoding;
+	char		collprovider = 0;
+	char	   *collversion = NULL;
 	Oid			newoid;
 	ObjectAddress address;
 
@@ -72,6 +82,10 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
 			defelp = &lccollateEl;
 		else if (pg_strcasecmp(defel->defname, "lc_ctype") == 0)
 			defelp = &lcctypeEl;
+		else if (pg_strcasecmp(defel->defname, "provider") == 0)
+			defelp = &providerEl;
+		else if (pg_strcasecmp(defel->defname, "version") == 0)
+			defelp = &versionEl;
 		else
 		{
 			ereport(ERROR,
@@ -103,6 +117,7 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
 
 		collcollate = pstrdup(NameStr(((Form_pg_collation) GETSTRUCT(tp))->collcollate));
 		collctype = pstrdup(NameStr(((Form_pg_collation) GETSTRUCT(tp))->collctype));
+		collprovider = ((Form_pg_collation) GETSTRUCT(tp))->collprovider;
 
 		ReleaseSysCache(tp);
 	}
@@ -119,6 +134,27 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
 	if (lcctypeEl)
 		collctype = defGetString(lcctypeEl);
 
+	if (providerEl)
+		collproviderstr = defGetString(providerEl);
+
+	if (versionEl)
+		collversion = defGetString(versionEl);
+
+	if (collproviderstr)
+	{
+		if (pg_strcasecmp(collproviderstr, "icu") == 0)
+			collprovider = COLLPROVIDER_ICU;
+		else if (pg_strcasecmp(collproviderstr, "libc") == 0)
+			collprovider = COLLPROVIDER_LIBC;
+		else
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+					 errmsg("unrecognized collation provider: %s",
+							collproviderstr)));
+	}
+	else if (!fromEl)
+		collprovider = COLLPROVIDER_LIBC;
+
 	if (!collcollate)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
@@ -129,14 +165,25 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
 				(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
 				 errmsg("parameter \"lc_ctype\" must be specified")));
 
-	check_encoding_locale_matches(GetDatabaseEncoding(), collcollate, collctype);
+	if (collprovider == COLLPROVIDER_ICU)
+		collencoding = -1;
+	else
+	{
+		collencoding = GetDatabaseEncoding();
+		check_encoding_locale_matches(collencoding, collcollate, collctype);
+	}
+
+	if (!collversion)
+		collversion = get_collation_actual_version(collprovider, collcollate);
 
 	newoid = CollationCreate(collName,
 							 collNamespace,
 							 GetUserId(),
-							 GetDatabaseEncoding(),
+							 collprovider,
+							 collencoding,
 							 collcollate,
 							 collctype,
+							 collversion,
 							 if_not_exists);
 
 	if (!OidIsValid(newoid))
@@ -182,16 +229,118 @@ IsThereCollationInNamespace(const char *collname, Oid nspOid)
 						collname, get_namespace_name(nspOid))));
 }
 
+/*
+ * ALTER COLLATION
+ */
+ObjectAddress
+AlterCollation(AlterCollationStmt *stmt)
+{
+	Relation	rel;
+	Oid			collOid;
+	HeapTuple	tup;
+	Form_pg_collation collForm;
+	Datum		collversion;
+	bool		isnull;
+	char	   *oldversion;
+	char	   *newversion;
+	ObjectAddress address;
+
+	rel = heap_open(CollationRelationId, RowExclusiveLock);
+	collOid = get_collation_oid(stmt->collname, false);
+
+	if (!pg_collation_ownercheck(collOid, GetUserId()))
+		aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_COLLATION,
+					   NameListToString(stmt->collname));
+
+	tup = SearchSysCacheCopy1(COLLOID, ObjectIdGetDatum(collOid));
+	if (!HeapTupleIsValid(tup))
+		elog(ERROR, "cache lookup failed for collation %u", collOid);
+
+	collForm = (Form_pg_collation) GETSTRUCT(tup);
+	collversion = SysCacheGetAttr(COLLOID, tup, Anum_pg_collation_collversion,
+								  &isnull);
+	oldversion = isnull ? NULL : TextDatumGetCString(collversion);
+
+	newversion = get_collation_actual_version(collForm->collprovider, NameStr(collForm->collcollate));
+
+	/* cannot change from NULL to non-NULL or vice versa */
+	if ((!oldversion && newversion) || (oldversion && !newversion))
+		elog(ERROR, "invalid collation version change");
+	else if (oldversion && newversion && strcmp(newversion, oldversion) != 0)
+	{
+		bool        nulls[Natts_pg_collation];
+		bool        replaces[Natts_pg_collation];
+		Datum       values[Natts_pg_collation];
+
+		ereport(NOTICE,
+				(errmsg("changing version from %s to %s",
+						oldversion, newversion)));
+
+		memset(values, 0, sizeof(values));
+		memset(nulls, false, sizeof(nulls));
+		memset(replaces, false, sizeof(replaces));
+
+		values[Anum_pg_collation_collversion - 1] = CStringGetTextDatum(newversion);
+		replaces[Anum_pg_collation_collversion - 1] = true;
+
+		tup = heap_modify_tuple(tup, RelationGetDescr(rel),
+								values, nulls, replaces);
+	}
+	else
+		ereport(NOTICE,
+				(errmsg("version has not changed")));
+
+	CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+	InvokeObjectPostAlterHook(CollationRelationId, collOid, 0);
+
+	ObjectAddressSet(address, CollationRelationId, collOid);
+
+	heap_freetuple(tup);
+	heap_close(rel, NoLock);
+
+	return address;
+}
+
+
+Datum
+pg_collation_actual_version(PG_FUNCTION_ARGS)
+{
+	Oid			collid = PG_GETARG_OID(0);
+	HeapTuple	tp;
+	char	   *collcollate;
+	char		collprovider;
+	char	   *version;
+
+	tp = SearchSysCache1(COLLOID, ObjectIdGetDatum(collid));
+	if (!HeapTupleIsValid(tp))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_OBJECT),
+				 errmsg("collation with OID %u does not exist", collid)));
+
+	collcollate = pstrdup(NameStr(((Form_pg_collation) GETSTRUCT(tp))->collcollate));
+	collprovider = ((Form_pg_collation) GETSTRUCT(tp))->collprovider;
+
+	ReleaseSysCache(tp);
+
+	version = get_collation_actual_version(collprovider, collcollate);
+
+	if (version)
+		PG_RETURN_TEXT_P(cstring_to_text(version));
+	else
+		PG_RETURN_NULL();
+}
+
 
 /*
- * "Normalize" a locale name, stripping off encoding tags such as
+ * "Normalize" a libc locale name, stripping off encoding tags such as
  * ".utf8" (e.g., "en_US.utf8" -> "en_US", but "br_FR.iso885915@euro"
  * -> "br_FR@euro").  Return true if a new, different name was
  * generated.
  */
 pg_attribute_unused()
 static bool
-normalize_locale_name(char *new, const char *old)
+normalize_libc_locale_name(char *new, const char *old)
 {
 	char	   *n = new;
 	const char *o = old;
@@ -219,6 +368,46 @@ normalize_locale_name(char *new, const char *old)
 }
 
 
+#ifdef USE_ICU
+static char *
+get_icu_language_tag(const char *localename)
+{
+	char		buf[ULOC_FULLNAME_CAPACITY];
+	UErrorCode	status;
+
+	status = U_ZERO_ERROR;
+	uloc_toLanguageTag(localename, buf, sizeof(buf), TRUE, &status);
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("could not convert locale name \"%s\" to language tag: %s",
+						localename, u_errorName(status))));
+
+	return pstrdup(buf);
+}
+
+
+static char *
+get_icu_locale_comment(const char *localename)
+{
+	UErrorCode	status;
+	UChar		displayname[128];
+	int32		len_uchar;
+	char	   *result;
+
+	status = U_ZERO_ERROR;
+	len_uchar = uloc_getDisplayName(localename, "en", &displayname[0], lengthof(displayname), &status);
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("could not get display name for locale \"%s\": %s",
+						localename, u_errorName(status))));
+
+	icu_from_uchar(&result, displayname, len_uchar);
+
+	return result;
+}
+#endif	/* USE_ICU */
+
+
 Datum
 pg_import_system_collations(PG_FUNCTION_ARGS)
 {
@@ -302,8 +491,10 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
 
 		count++;
 
-		CollationCreate(localebuf, nspid, GetUserId(), enc,
-						localebuf, localebuf, if_not_exists);
+		CollationCreate(localebuf, nspid, GetUserId(), COLLPROVIDER_LIBC, enc,
+						localebuf, localebuf,
+						get_collation_actual_version(COLLPROVIDER_LIBC, localebuf),
+						if_not_exists);
 
 		CommandCounterIncrement();
 
@@ -316,7 +507,7 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
 		 * "locale -a" output.  So save up the aliases and try to add them
 		 * after we've read all the output.
 		 */
-		if (normalize_locale_name(alias, localebuf))
+		if (normalize_libc_locale_name(alias, localebuf))
 		{
 			aliaslist = lappend(aliaslist, pstrdup(alias));
 			localelist = lappend(localelist, pstrdup(localebuf));
@@ -333,8 +524,10 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
 		char	   *locale = (char *) lfirst(lcl);
 		int			enc = lfirst_int(lce);
 
-		CollationCreate(alias, nspid, GetUserId(), enc,
-						locale, locale, true);
+		CollationCreate(alias, nspid, GetUserId(), COLLPROVIDER_LIBC, enc,
+						locale, locale,
+						get_collation_actual_version(COLLPROVIDER_LIBC, locale),
+						true);
 		CommandCounterIncrement();
 	}
 
@@ -343,5 +536,82 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
 				(errmsg("no usable system locales were found")));
 #endif   /* not HAVE_LOCALE_T && not WIN32 */
 
+#ifdef USE_ICU
+	if (!is_encoding_supported_by_icu(GetDatabaseEncoding()))
+	{
+		ereport(NOTICE,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("encoding \"%s\" not supported by ICU",
+						pg_encoding_to_char(GetDatabaseEncoding()))));
+	}
+	else
+	{
+		int i;
+
+		/*
+		 * Start the loop at -1 to sneak in the root locale without too much
+		 * code duplication.
+		 */
+		for (i = -1; i < ucol_countAvailable(); i++)
+		{
+			const char *name;
+			char	   *langtag;
+			const char *collcollate;
+			UEnumeration *en;
+			UErrorCode	status;
+			const char *val;
+			Oid			collid;
+
+			if (i == -1)
+				name = "";  /* ICU root locale */
+			else
+				name = ucol_getAvailable(i);
+
+			langtag = get_icu_language_tag(name);
+			collcollate = U_ICU_VERSION_MAJOR_NUM >= 54 ? langtag : name;
+			collid = CollationCreate(psprintf("%s-x-icu", langtag),
+									 nspid, GetUserId(), COLLPROVIDER_ICU, -1,
+									 collcollate, collcollate,
+									 get_collation_actual_version(COLLPROVIDER_ICU, collcollate),
+									 if_not_exists);
+
+			CreateComments(collid, CollationRelationId, 0,
+						   get_icu_locale_comment(name));
+
+			/*
+			 * Add keyword variants
+			 */
+			status = U_ZERO_ERROR;
+			en = ucol_getKeywordValuesForLocale("collation", name, TRUE, &status);
+			if (U_FAILURE(status))
+				ereport(ERROR,
+						(errmsg("could not get keyword values for locale \"%s\": %s",
+								name, u_errorName(status))));
+
+			status = U_ZERO_ERROR;
+			uenum_reset(en, &status);
+			while ((val = uenum_next(en, NULL, &status)))
+			{
+				char *localeid = psprintf("%s@collation=%s", name, val);
+
+				langtag = get_icu_language_tag(localeid);
+				collcollate = U_ICU_VERSION_MAJOR_NUM >= 54 ? langtag : localeid;
+				collid = CollationCreate(psprintf("%s-x-icu", langtag),
+										 nspid, GetUserId(), COLLPROVIDER_ICU, -1,
+										 collcollate, collcollate,
+										 get_collation_actual_version(COLLPROVIDER_ICU, collcollate),
+										 if_not_exists);
+				CreateComments(collid, CollationRelationId, 0,
+							   get_icu_locale_comment(localeid));
+			}
+			if (U_FAILURE(status))
+				ereport(ERROR,
+						(errmsg("could not get keyword values for locale \"%s\": %s",
+								name, u_errorName(status))));
+			uenum_close(en);
+		}
+	}
+#endif
+
 	PG_RETURN_VOID();
 }
diff --git a/src/backend/common.mk b/src/backend/common.mk
index 5d599dbd0c..0b57543bc4 100644
--- a/src/backend/common.mk
+++ b/src/backend/common.mk
@@ -8,6 +8,8 @@
 # this directory and SUBDIRS to subdirectories containing more things
 # to build.
 
+override CPPFLAGS := $(CPPFLAGS) $(ICU_CFLAGS)
+
 ifdef PARTIAL_LINKING
 # old style: linking using SUBSYS.o
 subsysfilename = SUBSYS.o
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 25fd051d6e..ac48903126 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -3028,6 +3028,16 @@ _copyAlterTableCmd(const AlterTableCmd *from)
 	return newnode;
 }
 
+static AlterCollationStmt *
+_copyAlterCollationStmt(const AlterCollationStmt *from)
+{
+	AlterCollationStmt *newnode = makeNode(AlterCollationStmt);
+
+	COPY_NODE_FIELD(collname);
+
+	return newnode;
+}
+
 static AlterDomainStmt *
 _copyAlterDomainStmt(const AlterDomainStmt *from)
 {
@@ -4960,6 +4970,9 @@ copyObject(const void *from)
 		case T_AlterTableCmd:
 			retval = _copyAlterTableCmd(from);
 			break;
+		case T_AlterCollationStmt:
+			retval = _copyAlterCollationStmt(from);
+			break;
 		case T_AlterDomainStmt:
 			retval = _copyAlterDomainStmt(from);
 			break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 67529e3f86..21547345d1 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1087,6 +1087,14 @@ _equalAlterTableCmd(const AlterTableCmd *a, const AlterTableCmd *b)
 }
 
 static bool
+_equalAlterCollationStmt(const AlterCollationStmt *a, const AlterCollationStmt *b)
+{
+	COMPARE_NODE_FIELD(collname);
+
+	return true;
+}
+
+static bool
 _equalAlterDomainStmt(const AlterDomainStmt *a, const AlterDomainStmt *b)
 {
 	COMPARE_SCALAR_FIELD(subtype);
@@ -3157,6 +3165,9 @@ equal(const void *a, const void *b)
 		case T_AlterTableCmd:
 			retval = _equalAlterTableCmd(a, b);
 			break;
+		case T_AlterCollationStmt:
+			retval = _equalAlterCollationStmt(a, b);
+			break;
 		case T_AlterDomainStmt:
 			retval = _equalAlterDomainStmt(a, b);
 			break;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 6316688a88..5a7e7aff21 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -244,7 +244,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 }
 
 %type <node>	stmt schema_stmt
-		AlterEventTrigStmt
+		AlterEventTrigStmt AlterCollationStmt
 		AlterDatabaseStmt AlterDatabaseSetStmt AlterDomainStmt AlterEnumStmt
 		AlterFdwStmt AlterForeignServerStmt AlterGroupStmt
 		AlterObjectDependsStmt AlterObjectSchemaStmt AlterOwnerStmt
@@ -812,6 +812,7 @@ stmtmulti:	stmtmulti ';' stmt
 
 stmt :
 			AlterEventTrigStmt
+			| AlterCollationStmt
 			| AlterDatabaseStmt
 			| AlterDatabaseSetStmt
 			| AlterDefaultPrivilegesStmt
@@ -9660,6 +9661,21 @@ DropdbStmt: DROP DATABASE database_name
 
 /*****************************************************************************
  *
+ *		ALTER COLLATION
+ *
+ *****************************************************************************/
+
+AlterCollationStmt: ALTER COLLATION any_name REFRESH VERSION_P
+				{
+					AlterCollationStmt *n = makeNode(AlterCollationStmt);
+					n->collname = $3;
+					$$ = (Node *)n;
+				}
+		;
+
+
+/*****************************************************************************
+ *
  *		ALTER SYSTEM
  *
  * This is used to change configuration parameters persistently.
diff --git a/src/backend/regex/regc_pg_locale.c b/src/backend/regex/regc_pg_locale.c
index 0121cbb2ad..4bdcb4fd6a 100644
--- a/src/backend/regex/regc_pg_locale.c
+++ b/src/backend/regex/regc_pg_locale.c
@@ -68,7 +68,8 @@ typedef enum
 	PG_REGEX_LOCALE_WIDE,		/* Use <wctype.h> functions */
 	PG_REGEX_LOCALE_1BYTE,		/* Use <ctype.h> functions */
 	PG_REGEX_LOCALE_WIDE_L,		/* Use locale_t <wctype.h> functions */
-	PG_REGEX_LOCALE_1BYTE_L		/* Use locale_t <ctype.h> functions */
+	PG_REGEX_LOCALE_1BYTE_L,	/* Use locale_t <ctype.h> functions */
+	PG_REGEX_LOCALE_ICU			/* Use ICU uchar.h functions */
 } PG_Locale_Strategy;
 
 static PG_Locale_Strategy pg_regex_strategy;
@@ -262,6 +263,11 @@ pg_set_regex_collation(Oid collation)
 					 errhint("Use the COLLATE clause to set the collation explicitly.")));
 		}
 
+#ifdef USE_ICU
+		if (pg_regex_locale && pg_regex_locale->provider == COLLPROVIDER_ICU)
+			pg_regex_strategy = PG_REGEX_LOCALE_ICU;
+		else
+#endif
 #ifdef USE_WIDE_UPPER_LOWER
 		if (GetDatabaseEncoding() == PG_UTF8)
 		{
@@ -303,13 +309,18 @@ pg_wc_isdigit(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswdigit_l((wint_t) c, pg_regex_locale);
+				return iswdigit_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isdigit_l((unsigned char) c, pg_regex_locale));
+					isdigit_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isdigit(c);
 #endif
 			break;
 	}
@@ -336,13 +347,18 @@ pg_wc_isalpha(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswalpha_l((wint_t) c, pg_regex_locale);
+				return iswalpha_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isalpha_l((unsigned char) c, pg_regex_locale));
+					isalpha_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isalpha(c);
 #endif
 			break;
 	}
@@ -369,13 +385,18 @@ pg_wc_isalnum(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswalnum_l((wint_t) c, pg_regex_locale);
+				return iswalnum_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isalnum_l((unsigned char) c, pg_regex_locale));
+					isalnum_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isalnum(c);
 #endif
 			break;
 	}
@@ -402,13 +423,18 @@ pg_wc_isupper(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswupper_l((wint_t) c, pg_regex_locale);
+				return iswupper_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isupper_l((unsigned char) c, pg_regex_locale));
+					isupper_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isupper(c);
 #endif
 			break;
 	}
@@ -435,13 +461,18 @@ pg_wc_islower(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswlower_l((wint_t) c, pg_regex_locale);
+				return iswlower_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					islower_l((unsigned char) c, pg_regex_locale));
+					islower_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_islower(c);
 #endif
 			break;
 	}
@@ -468,13 +499,18 @@ pg_wc_isgraph(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswgraph_l((wint_t) c, pg_regex_locale);
+				return iswgraph_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isgraph_l((unsigned char) c, pg_regex_locale));
+					isgraph_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isgraph(c);
 #endif
 			break;
 	}
@@ -501,13 +537,18 @@ pg_wc_isprint(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswprint_l((wint_t) c, pg_regex_locale);
+				return iswprint_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isprint_l((unsigned char) c, pg_regex_locale));
+					isprint_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isprint(c);
 #endif
 			break;
 	}
@@ -534,13 +575,18 @@ pg_wc_ispunct(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswpunct_l((wint_t) c, pg_regex_locale);
+				return iswpunct_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					ispunct_l((unsigned char) c, pg_regex_locale));
+					ispunct_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_ispunct(c);
 #endif
 			break;
 	}
@@ -567,13 +613,18 @@ pg_wc_isspace(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return iswspace_l((wint_t) c, pg_regex_locale);
+				return iswspace_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			return (c <= (pg_wchar) UCHAR_MAX &&
-					isspace_l((unsigned char) c, pg_regex_locale));
+					isspace_l((unsigned char) c, pg_regex_locale->info.lt));
+#endif
+			break;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_isspace(c);
 #endif
 			break;
 	}
@@ -608,15 +659,20 @@ pg_wc_toupper(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return towupper_l((wint_t) c, pg_regex_locale);
+				return towupper_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			if (c <= (pg_wchar) UCHAR_MAX)
-				return toupper_l((unsigned char) c, pg_regex_locale);
+				return toupper_l((unsigned char) c, pg_regex_locale->info.lt);
 #endif
 			return c;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_toupper(c);
+#endif
+			break;
 	}
 	return 0;					/* can't get here, but keep compiler quiet */
 }
@@ -649,15 +705,20 @@ pg_wc_tolower(pg_wchar c)
 		case PG_REGEX_LOCALE_WIDE_L:
 #if defined(HAVE_LOCALE_T) && defined(USE_WIDE_UPPER_LOWER)
 			if (sizeof(wchar_t) >= 4 || c <= (pg_wchar) 0xFFFF)
-				return towlower_l((wint_t) c, pg_regex_locale);
+				return towlower_l((wint_t) c, pg_regex_locale->info.lt);
 #endif
 			/* FALL THRU */
 		case PG_REGEX_LOCALE_1BYTE_L:
 #ifdef HAVE_LOCALE_T
 			if (c <= (pg_wchar) UCHAR_MAX)
-				return tolower_l((unsigned char) c, pg_regex_locale);
+				return tolower_l((unsigned char) c, pg_regex_locale->info.lt);
 #endif
 			return c;
+		case PG_REGEX_LOCALE_ICU:
+#ifdef USE_ICU
+			return u_tolower(c);
+#endif
+			break;
 	}
 	return 0;					/* can't get here, but keep compiler quiet */
 }
@@ -808,6 +869,9 @@ pg_ctype_get_cache(pg_wc_probefunc probefunc, int cclasscode)
 			max_chr = (pg_wchar) MAX_SIMPLE_CHR;
 #endif
 			break;
+		case PG_REGEX_LOCALE_ICU:
+			max_chr = (pg_wchar) MAX_SIMPLE_CHR;
+			break;
 		default:
 			max_chr = 0;		/* can't get here, but keep compiler quiet */
 			break;
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 20b5273405..c8d20fffea 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -1623,6 +1623,10 @@ ProcessUtilitySlow(ParseState *pstate,
 				commandCollected = true;
 				break;
 
+			case T_AlterCollationStmt:
+				address = AlterCollation((AlterCollationStmt *) parsetree);
+				break;
+
 			default:
 				elog(ERROR, "unrecognized node type: %d",
 					 (int) nodeTag(parsetree));
@@ -2673,6 +2677,10 @@ CreateCommandTag(Node *parsetree)
 			tag = "DROP SUBSCRIPTION";
 			break;
 
+		case T_AlterCollationStmt:
+			tag = "ALTER COLLATION";
+			break;
+
 		case T_PrepareStmt:
 			tag = "PREPARE";
 			break;
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index c16bfbca93..0566abd314 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -82,6 +82,10 @@
 #include <wctype.h>
 #endif
 
+#ifdef USE_ICU
+#include <unicode/ustring.h>
+#endif
+
 #include "catalog/pg_collation.h"
 #include "mb/pg_wchar.h"
 #include "utils/builtins.h"
@@ -1443,6 +1447,42 @@ str_numth(char *dest, char *num, int type)
  *			upper/lower/initcap functions
  *****************************************************************************/
 
+#ifdef USE_ICU
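+/*
+ * Apply an ICU case-conversion function to a UChar string, allocating the
+ * result buffer.  Try first with an output buffer of the same length as the
+ * input; if ICU reports an overflow, retry once with the length it reported.
+ */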
+static int32_t
+icu_convert_case(int32_t (*func)(UChar *, int32_t, const UChar *, int32_t, const char *, UErrorCode *),
+				 pg_locale_t mylocale, UChar **buff_dest, UChar *buff_source, int32_t len_source)
+{
+	UErrorCode	status;
+	int32_t		len_dest;
+
+	len_dest = len_source;  /* try first with same length */
+	*buff_dest = palloc(len_dest * sizeof(**buff_dest));
+	status = U_ZERO_ERROR;
+	len_dest = func(*buff_dest, len_dest, buff_source, len_source, mylocale->info.icu.locale, &status);
+	if (status == U_BUFFER_OVERFLOW_ERROR)
+	{
+		/* try again with adjusted length */
+		pfree(*buff_dest);
+		*buff_dest = palloc(len_dest * sizeof(**buff_dest));
+		status = U_ZERO_ERROR;
+		len_dest = func(*buff_dest, len_dest, buff_source, len_source, mylocale->info.icu.locale, &status);
+	}
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("case conversion failed: %s", u_errorName(status))));
+	return len_dest;
+}
+
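+/*
+ * Wrapper around u_strToTitle() with the signature expected by
+ * icu_convert_case(), using the default title-casing break iterator.
+ */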
+static int32_t
+u_strToTitle_default_BI(UChar *dest, int32_t destCapacity,
+						const UChar *src, int32_t srcLength,
+						const char *locale,
+						UErrorCode *pErrorCode)
+{
+	return u_strToTitle(dest, destCapacity, src, srcLength, NULL, locale, pErrorCode);
+}
+#endif
+
 /*
  * If the system provides the needed functions for wide-character manipulation
  * (which are all standardized by C99), then we implement upper/lower/initcap
@@ -1479,12 +1519,9 @@ str_tolower(const char *buff, size_t nbytes, Oid collid)
 		result = asc_tolower(buff, nbytes);
 	}
 #ifdef USE_WIDE_UPPER_LOWER
-	else if (pg_database_encoding_max_length() > 1)
+	else
 	{
 		pg_locale_t mylocale = 0;
-		wchar_t    *workspace;
-		size_t		curr_char;
-		size_t		result_size;
 
 		if (collid != DEFAULT_COLLATION_OID)
 		{
@@ -1502,77 +1539,79 @@ str_tolower(const char *buff, size_t nbytes, Oid collid)
 			mylocale = pg_newlocale_from_collation(collid);
 		}
 
-		/* Overflow paranoia */
-		if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
-			ereport(ERROR,
-					(errcode(ERRCODE_OUT_OF_MEMORY),
-					 errmsg("out of memory")));
+#ifdef USE_ICU
+		if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
+		{
+			int32_t		len_uchar;
+			int32_t		len_conv;
+			UChar	   *buff_uchar;
+			UChar	   *buff_conv;
+
+			len_uchar = icu_to_uchar(&buff_uchar, buff, nbytes);
+			len_conv = icu_convert_case(u_strToLower, mylocale, &buff_conv, buff_uchar, len_uchar);
+			icu_from_uchar(&result, buff_conv, len_conv);
+		}
+		else
+#endif
+		{
+			if (pg_database_encoding_max_length() > 1)
+			{
+				wchar_t    *workspace;
+				size_t		curr_char;
+				size_t		result_size;
 
-		/* Output workspace cannot have more codes than input bytes */
-		workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
+				/* Overflow paranoia */
+				if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
+					ereport(ERROR,
+							(errcode(ERRCODE_OUT_OF_MEMORY),
+							 errmsg("out of memory")));
 
-		char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
+				/* Output workspace cannot have more codes than input bytes */
+				workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
 
-		for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
-		{
+				char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
+
+				for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
+				{
 #ifdef HAVE_LOCALE_T
-			if (mylocale)
-				workspace[curr_char] = towlower_l(workspace[curr_char], mylocale);
-			else
+					if (mylocale)
+						workspace[curr_char] = towlower_l(workspace[curr_char], mylocale->info.lt);
+					else
 #endif
-				workspace[curr_char] = towlower(workspace[curr_char]);
-		}
+						workspace[curr_char] = towlower(workspace[curr_char]);
+				}
 
-		/* Make result large enough; case change might change number of bytes */
-		result_size = curr_char * pg_database_encoding_max_length() + 1;
-		result = palloc(result_size);
+				/* Make result large enough; case change might change number of bytes */
+				result_size = curr_char * pg_database_encoding_max_length() + 1;
+				result = palloc(result_size);
 
-		wchar2char(result, workspace, result_size, mylocale);
-		pfree(workspace);
-	}
+				wchar2char(result, workspace, result_size, mylocale);
+				pfree(workspace);
+			}
 #endif   /* USE_WIDE_UPPER_LOWER */
-	else
-	{
-#ifdef HAVE_LOCALE_T
-		pg_locale_t mylocale = 0;
-#endif
-		char	   *p;
-
-		if (collid != DEFAULT_COLLATION_OID)
-		{
-			if (!OidIsValid(collid))
+			else
 			{
-				/*
-				 * This typically means that the parser could not resolve a
-				 * conflict of implicit collations, so report it that way.
-				 */
-				ereport(ERROR,
-						(errcode(ERRCODE_INDETERMINATE_COLLATION),
-						 errmsg("could not determine which collation to use for lower() function"),
-						 errhint("Use the COLLATE clause to set the collation explicitly.")));
-			}
-#ifdef HAVE_LOCALE_T
-			mylocale = pg_newlocale_from_collation(collid);
-#endif
-		}
+				char	   *p;
 
-		result = pnstrdup(buff, nbytes);
+				result = pnstrdup(buff, nbytes);
 
-		/*
-		 * Note: we assume that tolower_l() will not be so broken as to need
-		 * an isupper_l() guard test.  When using the default collation, we
-		 * apply the traditional Postgres behavior that forces ASCII-style
-		 * treatment of I/i, but in non-default collations you get exactly
-		 * what the collation says.
-		 */
-		for (p = result; *p; p++)
-		{
+				/*
+				 * Note: we assume that tolower_l() will not be so broken as to need
+				 * an isupper_l() guard test.  When using the default collation, we
+				 * apply the traditional Postgres behavior that forces ASCII-style
+				 * treatment of I/i, but in non-default collations you get exactly
+				 * what the collation says.
+				 */
+				for (p = result; *p; p++)
+				{
 #ifdef HAVE_LOCALE_T
-			if (mylocale)
-				*p = tolower_l((unsigned char) *p, mylocale);
-			else
+					if (mylocale)
+						*p = tolower_l((unsigned char) *p, mylocale->info.lt);
+					else
 #endif
-				*p = pg_tolower((unsigned char) *p);
+						*p = pg_tolower((unsigned char) *p);
+				}
+			}
 		}
 	}
 
@@ -1599,12 +1638,9 @@ str_toupper(const char *buff, size_t nbytes, Oid collid)
 		result = asc_toupper(buff, nbytes);
 	}
 #ifdef USE_WIDE_UPPER_LOWER
-	else if (pg_database_encoding_max_length() > 1)
+	else
 	{
 		pg_locale_t mylocale = 0;
-		wchar_t    *workspace;
-		size_t		curr_char;
-		size_t		result_size;
 
 		if (collid != DEFAULT_COLLATION_OID)
 		{
@@ -1622,77 +1658,78 @@ str_toupper(const char *buff, size_t nbytes, Oid collid)
 			mylocale = pg_newlocale_from_collation(collid);
 		}
 
-		/* Overflow paranoia */
-		if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
-			ereport(ERROR,
-					(errcode(ERRCODE_OUT_OF_MEMORY),
-					 errmsg("out of memory")));
+#ifdef USE_ICU
+		if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
+		{
+			int32_t		len_uchar, len_conv;
+			UChar	   *buff_uchar;
+			UChar	   *buff_conv;
 
-		/* Output workspace cannot have more codes than input bytes */
-		workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
+			len_uchar = icu_to_uchar(&buff_uchar, buff, nbytes);
+			len_conv = icu_convert_case(u_strToUpper, mylocale, &buff_conv, buff_uchar, len_uchar);
+			icu_from_uchar(&result, buff_conv, len_conv);
+		}
+		else
+#endif
+		{
+			if (pg_database_encoding_max_length() > 1)
+			{
+				wchar_t    *workspace;
+				size_t		curr_char;
+				size_t		result_size;
 
-		char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
+				/* Overflow paranoia */
+				if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
+					ereport(ERROR,
+							(errcode(ERRCODE_OUT_OF_MEMORY),
+							 errmsg("out of memory")));
 
-		for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
-		{
-#ifdef HAVE_LOCALE_T
-			if (mylocale)
-				workspace[curr_char] = towupper_l(workspace[curr_char], mylocale);
-			else
-#endif
-				workspace[curr_char] = towupper(workspace[curr_char]);
-		}
+				/* Output workspace cannot have more codes than input bytes */
+				workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
 
-		/* Make result large enough; case change might change number of bytes */
-		result_size = curr_char * pg_database_encoding_max_length() + 1;
-		result = palloc(result_size);
+				char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
 
-		wchar2char(result, workspace, result_size, mylocale);
-		pfree(workspace);
-	}
-#endif   /* USE_WIDE_UPPER_LOWER */
-	else
-	{
+				for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
+				{
 #ifdef HAVE_LOCALE_T
-		pg_locale_t mylocale = 0;
+					if (mylocale)
+						workspace[curr_char] = towupper_l(workspace[curr_char], mylocale->info.lt);
+					else
 #endif
-		char	   *p;
+						workspace[curr_char] = towupper(workspace[curr_char]);
+				}
 
-		if (collid != DEFAULT_COLLATION_OID)
-		{
-			if (!OidIsValid(collid))
-			{
-				/*
-				 * This typically means that the parser could not resolve a
-				 * conflict of implicit collations, so report it that way.
-				 */
-				ereport(ERROR,
-						(errcode(ERRCODE_INDETERMINATE_COLLATION),
-						 errmsg("could not determine which collation to use for upper() function"),
-						 errhint("Use the COLLATE clause to set the collation explicitly.")));
+				/* Make result large enough; case change might change number of bytes */
+				result_size = curr_char * pg_database_encoding_max_length() + 1;
+				result = palloc(result_size);
+
+				wchar2char(result, workspace, result_size, mylocale);
+				pfree(workspace);
 			}
-#ifdef HAVE_LOCALE_T
-			mylocale = pg_newlocale_from_collation(collid);
-#endif
-		}
+#endif   /* USE_WIDE_UPPER_LOWER */
+			else
+			{
+				char	   *p;
 
-		result = pnstrdup(buff, nbytes);
+				result = pnstrdup(buff, nbytes);
 
-		/*
-		 * Note: we assume that toupper_l() will not be so broken as to need
-		 * an islower_l() guard test.  When using the default collation, we
-		 * apply the traditional Postgres behavior that forces ASCII-style
-		 * treatment of I/i, but in non-default collations you get exactly
-		 * what the collation says.
-		 */
-		for (p = result; *p; p++)
-		{
+				/*
+				 * Note: we assume that toupper_l() will not be so broken as to need
+				 * an islower_l() guard test.  When using the default collation, we
+				 * apply the traditional Postgres behavior that forces ASCII-style
+				 * treatment of I/i, but in non-default collations you get exactly
+				 * what the collation says.
+				 */
+				for (p = result; *p; p++)
+				{
 #ifdef HAVE_LOCALE_T
-			if (mylocale)
-				*p = toupper_l((unsigned char) *p, mylocale);
-			else
+					if (mylocale)
+						*p = toupper_l((unsigned char) *p, mylocale->info.lt);
+					else
 #endif
-				*p = pg_toupper((unsigned char) *p);
+						*p = pg_toupper((unsigned char) *p);
+				}
+			}
 		}
 	}
 
@@ -1720,12 +1757,9 @@ str_initcap(const char *buff, size_t nbytes, Oid collid)
 		result = asc_initcap(buff, nbytes);
 	}
 #ifdef USE_WIDE_UPPER_LOWER
-	else if (pg_database_encoding_max_length() > 1)
+	else
 	{
 		pg_locale_t mylocale = 0;
-		wchar_t    *workspace;
-		size_t		curr_char;
-		size_t		result_size;
 
 		if (collid != DEFAULT_COLLATION_OID)
 		{
@@ -1743,100 +1777,101 @@ str_initcap(const char *buff, size_t nbytes, Oid collid)
 			mylocale = pg_newlocale_from_collation(collid);
 		}
 
-		/* Overflow paranoia */
-		if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
-			ereport(ERROR,
-					(errcode(ERRCODE_OUT_OF_MEMORY),
-					 errmsg("out of memory")));
-
-		/* Output workspace cannot have more codes than input bytes */
-		workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
-
-		char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
-
-		for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
+#ifdef USE_ICU
+		if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
 		{
-#ifdef HAVE_LOCALE_T
-			if (mylocale)
-			{
-				if (wasalnum)
-					workspace[curr_char] = towlower_l(workspace[curr_char], mylocale);
-				else
-					workspace[curr_char] = towupper_l(workspace[curr_char], mylocale);
-				wasalnum = iswalnum_l(workspace[curr_char], mylocale);
-			}
-			else
+			int32_t		len_uchar, len_conv;
+			UChar	   *buff_uchar;
+			UChar	   *buff_conv;
+
+			len_uchar = icu_to_uchar(&buff_uchar, buff, nbytes);
+			len_conv = icu_convert_case(u_strToTitle_default_BI, mylocale, &buff_conv, buff_uchar, len_uchar);
+			icu_from_uchar(&result, buff_conv, len_conv);
+		}
+		else
 #endif
+		{
+			if (pg_database_encoding_max_length() > 1)
 			{
-				if (wasalnum)
-					workspace[curr_char] = towlower(workspace[curr_char]);
-				else
-					workspace[curr_char] = towupper(workspace[curr_char]);
-				wasalnum = iswalnum(workspace[curr_char]);
-			}
-		}
+				wchar_t    *workspace;
+				size_t		curr_char;
+				size_t		result_size;
 
-		/* Make result large enough; case change might change number of bytes */
-		result_size = curr_char * pg_database_encoding_max_length() + 1;
-		result = palloc(result_size);
+				/* Overflow paranoia */
+				if ((nbytes + 1) > (INT_MAX / sizeof(wchar_t)))
+					ereport(ERROR,
+							(errcode(ERRCODE_OUT_OF_MEMORY),
+							 errmsg("out of memory")));
 
-		wchar2char(result, workspace, result_size, mylocale);
-		pfree(workspace);
-	}
-#endif   /* USE_WIDE_UPPER_LOWER */
-	else
-	{
-#ifdef HAVE_LOCALE_T
-		pg_locale_t mylocale = 0;
-#endif
-		char	   *p;
+				/* Output workspace cannot have more codes than input bytes */
+				workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
 
-		if (collid != DEFAULT_COLLATION_OID)
-		{
-			if (!OidIsValid(collid))
-			{
-				/*
-				 * This typically means that the parser could not resolve a
-				 * conflict of implicit collations, so report it that way.
-				 */
-				ereport(ERROR,
-						(errcode(ERRCODE_INDETERMINATE_COLLATION),
-						 errmsg("could not determine which collation to use for initcap() function"),
-						 errhint("Use the COLLATE clause to set the collation explicitly.")));
-			}
+				char2wchar(workspace, nbytes + 1, buff, nbytes, mylocale);
+
+				for (curr_char = 0; workspace[curr_char] != 0; curr_char++)
+				{
 #ifdef HAVE_LOCALE_T
-			mylocale = pg_newlocale_from_collation(collid);
+					if (mylocale)
+					{
+						if (wasalnum)
+							workspace[curr_char] = towlower_l(workspace[curr_char], mylocale->info.lt);
+						else
+							workspace[curr_char] = towupper_l(workspace[curr_char], mylocale->info.lt);
+						wasalnum = iswalnum_l(workspace[curr_char], mylocale->info.lt);
+					}
+					else
 #endif
-		}
+					{
+						if (wasalnum)
+							workspace[curr_char] = towlower(workspace[curr_char]);
+						else
+							workspace[curr_char] = towupper(workspace[curr_char]);
+						wasalnum = iswalnum(workspace[curr_char]);
+					}
+				}
 
-		result = pnstrdup(buff, nbytes);
+				/* Make result large enough; case change might change number of bytes */
+				result_size = curr_char * pg_database_encoding_max_length() + 1;
+				result = palloc(result_size);
 
-		/*
-		 * Note: we assume that toupper_l()/tolower_l() will not be so broken
-		 * as to need guard tests.  When using the default collation, we apply
-		 * the traditional Postgres behavior that forces ASCII-style treatment
-		 * of I/i, but in non-default collations you get exactly what the
-		 * collation says.
-		 */
-		for (p = result; *p; p++)
-		{
-#ifdef HAVE_LOCALE_T
-			if (mylocale)
-			{
-				if (wasalnum)
-					*p = tolower_l((unsigned char) *p, mylocale);
-				else
-					*p = toupper_l((unsigned char) *p, mylocale);
-				wasalnum = isalnum_l((unsigned char) *p, mylocale);
+				wchar2char(result, workspace, result_size, mylocale);
+				pfree(workspace);
 			}
+#endif   /* USE_WIDE_UPPER_LOWER */
 			else
-#endif
 			{
-				if (wasalnum)
-					*p = pg_tolower((unsigned char) *p);
-				else
-					*p = pg_toupper((unsigned char) *p);
-				wasalnum = isalnum((unsigned char) *p);
+				char	   *p;
+
+				result = pnstrdup(buff, nbytes);
+
+				/*
+				 * Note: we assume that toupper_l()/tolower_l() will not be so broken
+				 * as to need guard tests.  When using the default collation, we apply
+				 * the traditional Postgres behavior that forces ASCII-style treatment
+				 * of I/i, but in non-default collations you get exactly what the
+				 * collation says.
+				 */
+				for (p = result; *p; p++)
+				{
+#ifdef HAVE_LOCALE_T
+					if (mylocale)
+					{
+						if (wasalnum)
+							*p = tolower_l((unsigned char) *p, mylocale->info.lt);
+						else
+							*p = toupper_l((unsigned char) *p, mylocale->info.lt);
+						wasalnum = isalnum_l((unsigned char) *p, mylocale->info.lt);
+					}
+					else
+#endif
+					{
+						if (wasalnum)
+							*p = pg_tolower((unsigned char) *p);
+						else
+							*p = pg_toupper((unsigned char) *p);
+						wasalnum = isalnum((unsigned char) *p);
+					}
+				}
 			}
 		}
 	}
diff --git a/src/backend/utils/adt/like.c b/src/backend/utils/adt/like.c
index 8d9d285fb5..1f683ccd0f 100644
--- a/src/backend/utils/adt/like.c
+++ b/src/backend/utils/adt/like.c
@@ -96,7 +96,7 @@ SB_lower_char(unsigned char c, pg_locale_t locale, bool locale_is_c)
 		return pg_ascii_tolower(c);
 #ifdef HAVE_LOCALE_T
 	else if (locale)
-		return tolower_l(c, locale);
+		return tolower_l(c, locale->info.lt);
 #endif
 	else
 		return pg_tolower(c);
@@ -165,14 +165,36 @@ Generic_Text_IC_like(text *str, text *pat, Oid collation)
 			   *p;
 	int			slen,
 				plen;
+	pg_locale_t locale = 0;
+	bool		locale_is_c = false;
+
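+	/*
+	 * Prepare locale information.  Besides being needed by SB_lower_char(),
+	 * it tells us whether the collation uses ICU, in which case we must take
+	 * the lower()-based path below.  This should match the methods used in
+	 * str_tolower().
+	 */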
+	if (lc_ctype_is_c(collation))
+		locale_is_c = true;
+	else if (collation != DEFAULT_COLLATION_OID)
+	{
+		if (!OidIsValid(collation))
+		{
+			/*
+			 * This typically means that the parser could not resolve a
+			 * conflict of implicit collations, so report it that way.
+			 */
+			ereport(ERROR,
+					(errcode(ERRCODE_INDETERMINATE_COLLATION),
+					 errmsg("could not determine which collation to use for ILIKE"),
+					 errhint("Use the COLLATE clause to set the collation explicitly.")));
+		}
+		locale = pg_newlocale_from_collation(collation);
+	}
 
 	/*
 	 * For efficiency reasons, in the single byte case we don't call lower()
 	 * on the pattern and text, but instead call SB_lower_char on each
-	 * character.  In the multi-byte case we don't have much choice :-(
+	 * character.  In the multi-byte case we don't have much choice :-(.
+	 * Also, ICU does not support single-character case folding, so we go the
+	 * long way.
 	 */
 
-	if (pg_database_encoding_max_length() > 1)
+	if (pg_database_encoding_max_length() > 1 || (locale && locale->provider == COLLPROVIDER_ICU))
 	{
 		/* lower's result is never packed, so OK to use old macros here */
 		pat = DatumGetTextPP(DirectFunctionCall1Coll(lower, collation,
@@ -190,31 +212,6 @@ Generic_Text_IC_like(text *str, text *pat, Oid collation)
 	}
 	else
 	{
-		/*
-		 * Here we need to prepare locale information for SB_lower_char. This
-		 * should match the methods used in str_tolower().
-		 */
-		pg_locale_t locale = 0;
-		bool		locale_is_c = false;
-
-		if (lc_ctype_is_c(collation))
-			locale_is_c = true;
-		else if (collation != DEFAULT_COLLATION_OID)
-		{
-			if (!OidIsValid(collation))
-			{
-				/*
-				 * This typically means that the parser could not resolve a
-				 * conflict of implicit collations, so report it that way.
-				 */
-				ereport(ERROR,
-						(errcode(ERRCODE_INDETERMINATE_COLLATION),
-						 errmsg("could not determine which collation to use for ILIKE"),
-						 errhint("Use the COLLATE clause to set the collation explicitly.")));
-			}
-			locale = pg_newlocale_from_collation(collation);
-		}
-
 		p = VARDATA_ANY(pat);
 		plen = VARSIZE_ANY_EXHDR(pat);
 		s = VARDATA_ANY(str);
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index ab197025f8..2a2c9bc504 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -57,11 +57,17 @@
 #include "catalog/pg_collation.h"
 #include "catalog/pg_control.h"
 #include "mb/pg_wchar.h"
+#include "utils/builtins.h"
 #include "utils/hsearch.h"
+#include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_locale.h"
 #include "utils/syscache.h"
 
+#ifdef USE_ICU
+#include <unicode/ucnv.h>
+#endif
+
 #ifdef WIN32
 /*
  * This Windows file defines StrNCpy. We don't need it here, so we undefine
@@ -1272,12 +1278,13 @@ pg_newlocale_from_collation(Oid collid)
 	if (cache_entry->locale == 0)
 	{
 		/* We haven't computed this yet in this session, so do it */
-#ifdef HAVE_LOCALE_T
 		HeapTuple	tp;
 		Form_pg_collation collform;
 		const char *collcollate;
-		const char *collctype;
-		locale_t	result;
+		const char *collctype pg_attribute_unused();
+		pg_locale_t	result;
+		Datum		collversion;
+		bool		isnull;
 
 		tp = SearchSysCache1(COLLOID, ObjectIdGetDatum(collid));
 		if (!HeapTupleIsValid(tp))
@@ -1287,61 +1294,230 @@ pg_newlocale_from_collation(Oid collid)
 		collcollate = NameStr(collform->collcollate);
 		collctype = NameStr(collform->collctype);
 
-		if (strcmp(collcollate, collctype) == 0)
+		result = malloc(sizeof(*result));
+		if (!result)
+			ereport(ERROR,
+					(errcode(ERRCODE_OUT_OF_MEMORY),
+					 errmsg("out of memory")));
+		memset(result, 0, sizeof(*result));
+		result->provider = collform->collprovider;
+
+		if (collform->collprovider == COLLPROVIDER_LIBC)
 		{
-			/* Normal case where they're the same */
+#ifdef HAVE_LOCALE_T
+			locale_t	loc;
+
+			if (strcmp(collcollate, collctype) == 0)
+			{
+				/* Normal case where they're the same */
 #ifndef WIN32
-			result = newlocale(LC_COLLATE_MASK | LC_CTYPE_MASK, collcollate,
-							   NULL);
+				loc = newlocale(LC_COLLATE_MASK | LC_CTYPE_MASK, collcollate,
+								   NULL);
 #else
-			result = _create_locale(LC_ALL, collcollate);
+				loc = _create_locale(LC_ALL, collcollate);
 #endif
-			if (!result)
-				report_newlocale_failure(collcollate);
-		}
-		else
-		{
+				if (!loc)
+					report_newlocale_failure(collcollate);
+			}
+			else
+			{
 #ifndef WIN32
-			/* We need two newlocale() steps */
-			locale_t	loc1;
-
-			loc1 = newlocale(LC_COLLATE_MASK, collcollate, NULL);
-			if (!loc1)
-				report_newlocale_failure(collcollate);
-			result = newlocale(LC_CTYPE_MASK, collctype, loc1);
-			if (!result)
-				report_newlocale_failure(collctype);
+				/* We need two newlocale() steps */
+				locale_t	loc1;
+
+				loc1 = newlocale(LC_COLLATE_MASK, collcollate, NULL);
+				if (!loc1)
+					report_newlocale_failure(collcollate);
+				loc = newlocale(LC_CTYPE_MASK, collctype, loc1);
+				if (!loc)
+					report_newlocale_failure(collctype);
 #else
 
-			/*
-			 * XXX The _create_locale() API doesn't appear to support this.
-			 * Could perhaps be worked around by changing pg_locale_t to
-			 * contain two separate fields.
-			 */
+				/*
+				 * XXX The _create_locale() API doesn't appear to support this.
+				 * Could perhaps be worked around by changing pg_locale_t to
+				 * contain two separate fields.
+				 */
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("collations with different collate and ctype values are not supported on this platform")));
+#endif
+			}
+
+			result->info.lt = loc;
+#else							/* not HAVE_LOCALE_T */
+			/* platform that doesn't support locale_t */
 			ereport(ERROR,
 					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-					 errmsg("collations with different collate and ctype values are not supported on this platform")));
-#endif
+					 errmsg("collation provider LIBC is not supported on this platform")));
+#endif   /* not HAVE_LOCALE_T */
+		}
+		else if (collform->collprovider == COLLPROVIDER_ICU)
+		{
+#ifdef USE_ICU
+			UCollator  *collator;
+			UErrorCode	status;
+
+			status = U_ZERO_ERROR;
+			collator = ucol_open(collcollate, &status);
+			if (U_FAILURE(status))
+				ereport(ERROR,
+						(errmsg("could not open collator for locale \"%s\": %s",
+								collcollate, u_errorName(status))));
+
+			result->info.icu.locale = strdup(collcollate);
+			result->info.icu.ucol = collator;
+#else /* not USE_ICU */
+			/* could get here if a collation was created by a build with ICU */
+			ereport(ERROR,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+					 errmsg("ICU is not supported in this build"),
+					 errhint("You need to rebuild PostgreSQL using --with-icu.")));
+#endif /* not USE_ICU */
 		}
 
-		cache_entry->locale = result;
+		collversion = SysCacheGetAttr(COLLOID, tp, Anum_pg_collation_collversion,
+									  &isnull);
+		if (!isnull)
+		{
+			char	   *actual_versionstr;
+			char	   *collversionstr;
+
+			actual_versionstr = get_collation_actual_version(collform->collprovider, collcollate);
+			if (!actual_versionstr)
+				/*
+				 * This could happen when specifying a version in CREATE
+				 * COLLATION for a libc locale, or manually creating a mess
+				 * in the catalogs.
+				 */
+				ereport(ERROR,
+						(errmsg("collation \"%s\" has no actual version, but a version was specified",
+								NameStr(collform->collname))));
+			collversionstr = TextDatumGetCString(collversion);
+
+			if (strcmp(actual_versionstr, collversionstr) != 0)
+				ereport(WARNING,
+						(errmsg("collation \"%s\" has version mismatch",
+								NameStr(collform->collname)),
+						 errdetail("The collation in the database was created using version %s, "
+								   "but the operating system provides version %s.",
+								   collversionstr, actual_versionstr),
+						 errhint("Rebuild all objects affected by this collation and run "
+								 "ALTER COLLATION %s REFRESH VERSION, "
+								 "or build PostgreSQL with the right library version.",
+								 quote_qualified_identifier(get_namespace_name(collform->collnamespace),
+															NameStr(collform->collname)))));
+		}
 
 		ReleaseSysCache(tp);
-#else							/* not HAVE_LOCALE_T */
 
-		/*
-		 * For platforms that don't support locale_t, we can't do anything
-		 * with non-default collations.
-		 */
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-		errmsg("nondefault collations are not supported on this platform")));
-#endif   /* not HAVE_LOCALE_T */
+		cache_entry->locale = result;
 	}
 
 	return cache_entry->locale;
 }
 
+/*
+ * Get provider-specific collation version string for the given collation from
+ * the operating system/library.
+ *
+ * A particular provider must always either return a non-NULL string or return
+ * NULL (if it doesn't support versions).  It must not return NULL for some
+ * collcollate and not NULL for others.
+ */
+char *
+get_collation_actual_version(char collprovider, const char *collcollate)
+{
+	char	   *collversion;
+
+#ifdef USE_ICU
+	if (collprovider == COLLPROVIDER_ICU)
+	{
+		UCollator  *collator;
+		UErrorCode	status;
+		UVersionInfo versioninfo;
+		char		buf[U_MAX_VERSION_STRING_LENGTH];
+
+		status = U_ZERO_ERROR;
+		collator = ucol_open(collcollate, &status);
+		if (U_FAILURE(status))
+			ereport(ERROR,
+					(errmsg("could not open collator for locale \"%s\": %s",
+							collcollate, u_errorName(status))));
+		ucol_getVersion(collator, versioninfo);
+		ucol_close(collator);
+
+		u_versionToString(versioninfo, buf);
+		collversion = pstrdup(buf);
+	}
+	else
+#endif
+		collversion = NULL;
+
+	return collversion;
+}
+
+
+#ifdef USE_ICU
+/*
+ * Converter object for converting between ICU's UChar strings and C strings
+ * in database encoding.  Since the database encoding doesn't change, we only
+ * need one of these per session.
+ */
+static UConverter *icu_converter = NULL;
+
+static void
+init_icu_converter(void)
+{
+	const char *icu_encoding_name;
+	UErrorCode	status;
+	UConverter *conv;
+
+	if (icu_converter)
+		return;
+
+	icu_encoding_name = get_encoding_name_for_icu(GetDatabaseEncoding());
+
+	status = U_ZERO_ERROR;
+	conv = ucnv_open(icu_encoding_name, &status);
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("could not open ICU converter for encoding \"%s\": %s",
+						icu_encoding_name, u_errorName(status))));
+
+	icu_converter = conv;
+}
+
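+/*
+ * Convert a string in the database encoding into a string of UChars.
+ * The output buffer is palloc'd; the number of UChars is returned.
+ */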
+int32_t
+icu_to_uchar(UChar **buff_uchar, const char *buff, size_t nbytes)
+{
+	UErrorCode	status;
+	int32_t		len_uchar;
+
+	init_icu_converter();
+
+	len_uchar = 2 * nbytes;  /* max length per docs */
+	*buff_uchar = palloc(len_uchar * sizeof(**buff_uchar));
+	status = U_ZERO_ERROR;
+	len_uchar = ucnv_toUChars(icu_converter, *buff_uchar, len_uchar, buff, nbytes, &status);
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("ucnv_toUChars failed: %s", u_errorName(status))));
+	return len_uchar;
+}
+
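+/*
+ * Convert a string of UChars into a palloc'd string in the database
+ * encoding; the length of the result in bytes is returned.
+ */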
+int32_t
+icu_from_uchar(char **result, UChar *buff_uchar, int32_t len_uchar)
+{
+	UErrorCode	status;
+	int32_t		len_result;
+
+	init_icu_converter();
+
+	len_result = UCNV_GET_MAX_BYTES_FOR_STRING(len_uchar, ucnv_getMaxCharSize(icu_converter));
+	*result = palloc(len_result + 1);
+	status = U_ZERO_ERROR;
+	len_result = ucnv_fromUChars(icu_converter, *result, len_result + 1, buff_uchar, len_uchar, &status);
+	if (U_FAILURE(status))
+		ereport(ERROR,
+				(errmsg("ucnv_fromUChars failed: %s", u_errorName(status))));
+	return len_result;
+}
+#endif
 
 /*
  * These functions convert from/to libc's wchar_t, *not* pg_wchar_t.
@@ -1362,6 +1538,8 @@ wchar2char(char *to, const wchar_t *from, size_t tolen, pg_locale_t locale)
 {
 	size_t		result;
 
+	Assert(!locale || locale->provider == COLLPROVIDER_LIBC);
+
 	if (tolen == 0)
 		return 0;
 
@@ -1398,10 +1576,10 @@ wchar2char(char *to, const wchar_t *from, size_t tolen, pg_locale_t locale)
 #ifdef HAVE_LOCALE_T
 #ifdef HAVE_WCSTOMBS_L
 		/* Use wcstombs_l for nondefault locales */
-		result = wcstombs_l(to, from, tolen, locale);
+		result = wcstombs_l(to, from, tolen, locale->info.lt);
 #else							/* !HAVE_WCSTOMBS_L */
 		/* We have to temporarily set the locale as current ... ugh */
-		locale_t	save_locale = uselocale(locale);
+		locale_t	save_locale = uselocale(locale->info.lt);
 
 		result = wcstombs(to, from, tolen);
 
@@ -1432,6 +1610,8 @@ char2wchar(wchar_t *to, size_t tolen, const char *from, size_t fromlen,
 {
 	size_t		result;
 
+	Assert(!locale || locale->provider == COLLPROVIDER_LIBC);
+
 	if (tolen == 0)
 		return 0;
 
@@ -1473,10 +1653,10 @@ char2wchar(wchar_t *to, size_t tolen, const char *from, size_t fromlen,
 #ifdef HAVE_LOCALE_T
 #ifdef HAVE_MBSTOWCS_L
 			/* Use mbstowcs_l for nondefault locales */
-			result = mbstowcs_l(to, str, tolen, locale);
+			result = mbstowcs_l(to, str, tolen, locale->info.lt);
 #else							/* !HAVE_MBSTOWCS_L */
 			/* We have to temporarily set the locale as current ... ugh */
-			locale_t	save_locale = uselocale(locale);
+			locale_t	save_locale = uselocale(locale->info.lt);
 
 			result = mbstowcs(to, str, tolen);
 
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index bb9a544686..f8b28fe0e6 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -5259,7 +5259,7 @@ find_join_input_rel(PlannerInfo *root, Relids relids)
 /*
  * Check whether char is a letter (and, hence, subject to case-folding)
  *
- * In multibyte character sets, we can't use isalpha, and it does not seem
+ * In multibyte character sets or with ICU, we can't use isalpha, and it does not seem
  * worth trying to convert to wchar_t to use iswalpha.  Instead, just assume
  * any multibyte char is potentially case-varying.
  */
@@ -5271,9 +5271,11 @@ pattern_char_isalpha(char c, bool is_multibyte,
 		return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
 	else if (is_multibyte && IS_HIGHBIT_SET(c))
 		return true;
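+	/* With ICU, assume any non-ASCII character is potentially case-varying */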
+	else if (locale && locale->provider == COLLPROVIDER_ICU)
+		return IS_HIGHBIT_SET(c) ? true : false;
 #ifdef HAVE_LOCALE_T
-	else if (locale)
-		return isalpha_l((unsigned char) c, locale);
+	else if (locale && locale->provider == COLLPROVIDER_LIBC)
+		return isalpha_l((unsigned char) c, locale->info.lt);
 #endif
 	else
 		return isalpha((unsigned char) c);
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index cd036afc00..aa556aa5de 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -73,9 +73,7 @@ typedef struct
 	hyperLogLogState abbr_card; /* Abbreviated key cardinality state */
 	hyperLogLogState full_card; /* Full key cardinality state */
 	double		prop_card;		/* Required cardinality proportion */
-#ifdef HAVE_LOCALE_T
 	pg_locale_t locale;
-#endif
 } VarStringSortSupport;
 
 /*
@@ -1403,10 +1401,7 @@ varstr_cmp(char *arg1, int len1, char *arg2, int len2, Oid collid)
 		char		a2buf[TEXTBUFLEN];
 		char	   *a1p,
 				   *a2p;
-
-#ifdef HAVE_LOCALE_T
 		pg_locale_t mylocale = 0;
-#endif
 
 		if (collid != DEFAULT_COLLATION_OID)
 		{
@@ -1421,9 +1416,7 @@ varstr_cmp(char *arg1, int len1, char *arg2, int len2, Oid collid)
 						 errmsg("could not determine which collation to use for string comparison"),
 						 errhint("Use the COLLATE clause to set the collation explicitly.")));
 			}
-#ifdef HAVE_LOCALE_T
 			mylocale = pg_newlocale_from_collation(collid);
-#endif
 		}
 
 		/*
@@ -1542,11 +1535,54 @@ varstr_cmp(char *arg1, int len1, char *arg2, int len2, Oid collid)
 		memcpy(a2p, arg2, len2);
 		a2p[len2] = '\0';
 
-#ifdef HAVE_LOCALE_T
 		if (mylocale)
-			result = strcoll_l(a1p, a2p, mylocale);
-		else
+		{
+			if (mylocale->provider == COLLPROVIDER_ICU)
+			{
+#ifdef USE_ICU
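+				/*
+				 * Compare with the ICU collator, preferring the UTF-8 API
+				 * when available so we need not convert to UChar first.
+				 */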
+#ifdef HAVE_UCOL_STRCOLLUTF8
+				if (GetDatabaseEncoding() == PG_UTF8)
+				{
+					UErrorCode	status;
+
+					status = U_ZERO_ERROR;
+					result = ucol_strcollUTF8(mylocale->info.icu.ucol,
+											  arg1, len1,
+											  arg2, len2,
+											  &status);
+					if (U_FAILURE(status))
+						ereport(ERROR,
+								(errmsg("collation failed: %s", u_errorName(status))));
+				}
+				else
+#endif
+				{
+					int32_t ulen1, ulen2;
+					UChar *uchar1, *uchar2;
+
+					ulen1 = icu_to_uchar(&uchar1, arg1, len1);
+					ulen2 = icu_to_uchar(&uchar2, arg2, len2);
+
+					result = ucol_strcoll(mylocale->info.icu.ucol,
+										  uchar1, ulen1,
+										  uchar2, ulen2);
+				}
+#else	/* not USE_ICU */
+				/* shouldn't happen */
+				elog(ERROR, "unsupported collprovider: %c", mylocale->provider);
+#endif	/* not USE_ICU */
+			}
+			else
+			{
+#ifdef HAVE_LOCALE_T
+				result = strcoll_l(a1p, a2p, mylocale->info.lt);
+#else
+				/* shouldn't happen */
+				elog(ERROR, "unsupported collprovider: %c", mylocale->provider);
 #endif
+			}
+		}
+		else
 			result = strcoll(a1p, a2p);
 
 		/*
@@ -1768,10 +1804,7 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
 	bool		abbreviate = ssup->abbreviate;
 	bool		collate_c = false;
 	VarStringSortSupport *sss;
-
-#ifdef HAVE_LOCALE_T
 	pg_locale_t locale = 0;
-#endif
 
 	/*
 	 * If possible, set ssup->comparator to a function which can be used to
@@ -1826,9 +1859,7 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
 						 errmsg("could not determine which collation to use for string comparison"),
 						 errhint("Use the COLLATE clause to set the collation explicitly.")));
 			}
-#ifdef HAVE_LOCALE_T
 			locale = pg_newlocale_from_collation(collid);
-#endif
 		}
 	}
 
@@ -1854,7 +1885,7 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
 	 * platforms.
 	 */
 #ifndef TRUST_STRXFRM
-	if (!collate_c)
+	if (!collate_c && !(locale && locale->provider == COLLPROVIDER_ICU))
 		abbreviate = false;
 #endif
 
@@ -1877,9 +1908,7 @@ varstr_sortsupport(SortSupport ssup, Oid collid, bool bpchar)
 		sss->last_len2 = -1;
 		/* Initialize */
 		sss->last_returned = 0;
-#ifdef HAVE_LOCALE_T
 		sss->locale = locale;
-#endif
 
 		/*
 		 * To avoid somehow confusing a strxfrm() blob and an original string,
@@ -2090,11 +2119,54 @@ varstrfastcmp_locale(Datum x, Datum y, SortSupport ssup)
 		goto done;
 	}
 
-#ifdef HAVE_LOCALE_T
 	if (sss->locale)
-		result = strcoll_l(sss->buf1, sss->buf2, sss->locale);
-	else
+	{
+		if (sss->locale->provider == COLLPROVIDER_ICU)
+		{
+#ifdef USE_ICU
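+			/*
+			 * Compare with the ICU collator, preferring the UTF-8 API when
+			 * available so we need not convert to UChar first.
+			 */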
+#ifdef HAVE_UCOL_STRCOLLUTF8
+			if (GetDatabaseEncoding() == PG_UTF8)
+			{
+				UErrorCode	status;
+
+				status = U_ZERO_ERROR;
+				result = ucol_strcollUTF8(sss->locale->info.icu.ucol,
+										  a1p, len1,
+										  a2p, len2,
+										  &status);
+				if (U_FAILURE(status))
+					ereport(ERROR,
+							(errmsg("collation failed: %s", u_errorName(status))));
+			}
+			else
 #endif
+			{
+				int32_t ulen1, ulen2;
+				UChar *uchar1, *uchar2;
+
+				ulen1 = icu_to_uchar(&uchar1, a1p, len1);
+				ulen2 = icu_to_uchar(&uchar2, a2p, len2);
+
+				result = ucol_strcoll(sss->locale->info.icu.ucol,
+									  uchar1, ulen1,
+									  uchar2, ulen2);
+			}
+#else	/* not USE_ICU */
+			/* shouldn't happen */
+			elog(ERROR, "unsupported collprovider: %c", sss->locale->provider);
+#endif	/* not USE_ICU */
+		}
+		else
+		{
+#ifdef HAVE_LOCALE_T
+			result = strcoll_l(sss->buf1, sss->buf2, sss->locale->info.lt);
+#else
+			/* shouldn't happen */
+			elog(ERROR, "unsupported collprovider: %c", sss->locale->provider);
+#endif
+		}
+	}
+	else
 		result = strcoll(sss->buf1, sss->buf2);
 
 	/*
@@ -2200,9 +2272,14 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
 	else
 	{
 		Size		bsize;
+#ifdef USE_ICU
+		int32_t		ulen = -1;
+		UChar	   *uchar;
+#endif
 
 		/*
-		 * We're not using the C collation, so fall back on strxfrm.
+		 * We're not using the C collation, so fall back on strxfrm or ICU
+		 * analogs.
 		 */
 
 		/* By convention, we use buffer 1 to store and NUL-terminate */
@@ -2222,17 +2299,66 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
 			goto done;
 		}
 
-		/* Just like strcoll(), strxfrm() expects a NUL-terminated string */
 		memcpy(sss->buf1, authoritative_data, len);
+		/*
+		 * Just like strcoll(), strxfrm() expects a NUL-terminated string.
+		 * Not necessary for ICU, but doesn't hurt.
+		 */
 		sss->buf1[len] = '\0';
 		sss->last_len1 = len;
 
+#ifdef USE_ICU
+		/* When using ICU and not UTF8, convert string to UChar. */
+		if (sss->locale && sss->locale->provider == COLLPROVIDER_ICU &&
+			GetDatabaseEncoding() != PG_UTF8)
+			ulen = icu_to_uchar(&uchar, sss->buf1, len);
+#endif
+
+		/*
+		 * Loop: Call strxfrm() or ucol_getSortKey(), possibly enlarge buffer,
+		 * and try again.  Both of these functions have the result buffer
+		 * content undefined if the result did not fit, so we need to retry
+		 * until everything fits, even though we only need the first few bytes
+		 * in the end.  When using ucol_nextSortKeyPart(), however, we only
+		 * ask for as many bytes as we actually need.
+		 */
 		for (;;)
 		{
+#ifdef USE_ICU
+			if (sss->locale && sss->locale->provider == COLLPROVIDER_ICU)
+			{
+				/*
+				 * When using UTF8, use the iteration interface so we only
+				 * need to produce as many bytes as we actually need.
+				 */
+				if (GetDatabaseEncoding() == PG_UTF8)
+				{
+					UCharIterator iter;
+					uint32_t	state[2];
+					UErrorCode	status;
+
+					uiter_setUTF8(&iter, sss->buf1, len);
+					state[0] = state[1] = 0;  /* won't need that again */
+					status = U_ZERO_ERROR;
+					bsize = ucol_nextSortKeyPart(sss->locale->info.icu.ucol,
+												 &iter,
+												 state,
+												 (uint8_t *) sss->buf2,
+												 Min(sizeof(Datum), sss->buflen2),
+												 &status);
+					if (U_FAILURE(status))
+						ereport(ERROR,
+								(errmsg("sort key generation failed: %s", u_errorName(status))));
+				}
+				else
+					bsize = ucol_getSortKey(sss->locale->info.icu.ucol,
+											uchar, ulen,
+											(uint8_t *) sss->buf2, sss->buflen2);
+			}
+			else
+#endif
 #ifdef HAVE_LOCALE_T
-			if (sss->locale)
+			if (sss->locale && sss->locale->provider == COLLPROVIDER_LIBC)
 				bsize = strxfrm_l(sss->buf2, sss->buf1,
-								  sss->buflen2, sss->locale);
+								  sss->buflen2, sss->locale->info.lt);
 			else
 #endif
 				bsize = strxfrm(sss->buf2, sss->buf1, sss->buflen2);
@@ -2242,8 +2368,7 @@ varstr_abbrev_convert(Datum original, SortSupport ssup)
 				break;
 
 			/*
-			 * The C standard states that the contents of the buffer is now
-			 * unspecified.  Grow buffer, and retry.
+			 * Grow buffer and retry.
 			 */
 			pfree(sss->buf2);
 			sss->buflen2 = Max(bsize + 1,
diff --git a/src/backend/utils/mb/encnames.c b/src/backend/utils/mb/encnames.c
index 11099b844f..444eec25b5 100644
--- a/src/backend/utils/mb/encnames.c
+++ b/src/backend/utils/mb/encnames.c
@@ -403,6 +403,82 @@ const pg_enc2gettext pg_enc2gettext_tbl[] =
 };
 
 
+#ifndef FRONTEND
+
+/*
+ * Table of encoding names for ICU
+ *
+ * Reference: <https://ssl.icu-project.org/icu-bin/convexp>
+ *
+ * NULL entries are not supported by ICU, or their mapping is unclear.
+ */
+static const char * const pg_enc2icu_tbl[] =
+{
+	NULL,					/* PG_SQL_ASCII */
+	"EUC-JP",				/* PG_EUC_JP */
+	"EUC-CN",				/* PG_EUC_CN */
+	"EUC-KR",				/* PG_EUC_KR */
+	"EUC-TW",				/* PG_EUC_TW */
+	NULL,					/* PG_EUC_JIS_2004 */
+	"UTF-8",				/* PG_UTF8 */
+	NULL,					/* PG_MULE_INTERNAL */
+	"ISO-8859-1",			/* PG_LATIN1 */
+	"ISO-8859-2",			/* PG_LATIN2 */
+	"ISO-8859-3",			/* PG_LATIN3 */
+	"ISO-8859-4",			/* PG_LATIN4 */
+	"ISO-8859-9",			/* PG_LATIN5 */
+	"ISO-8859-10",			/* PG_LATIN6 */
+	"ISO-8859-13",			/* PG_LATIN7 */
+	"ISO-8859-14",			/* PG_LATIN8 */
+	"ISO-8859-15",			/* PG_LATIN9 */
+	NULL,					/* PG_LATIN10 */
+	"CP1256",				/* PG_WIN1256 */
+	"CP1258",				/* PG_WIN1258 */
+	"CP866",				/* PG_WIN866 */
+	NULL,					/* PG_WIN874 */
+	"KOI8-R",				/* PG_KOI8R */
+	"CP1251",				/* PG_WIN1251 */
+	"CP1252",				/* PG_WIN1252 */
+	"ISO-8859-5",			/* PG_ISO_8859_5 */
+	"ISO-8859-6",			/* PG_ISO_8859_6 */
+	"ISO-8859-7",			/* PG_ISO_8859_7 */
+	"ISO-8859-8",			/* PG_ISO_8859_8 */
+	"CP1250",				/* PG_WIN1250 */
+	"CP1253",				/* PG_WIN1253 */
+	"CP1254",				/* PG_WIN1254 */
+	"CP1255",				/* PG_WIN1255 */
+	"CP1257",				/* PG_WIN1257 */
+	"KOI8-U",				/* PG_KOI8U */
+};
+
+bool
+is_encoding_supported_by_icu(int encoding)
+{
+	return (pg_enc2icu_tbl[encoding] != NULL);
+}
+
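+/*
+ * Return the ICU converter name for the given database encoding, erroring
+ * out if the encoding is not supported by ICU.
+ */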
+const char *
+get_encoding_name_for_icu(int encoding)
+{
+	const char *icu_encoding_name;
+
+	StaticAssertStmt(lengthof(pg_enc2icu_tbl) == PG_ENCODING_BE_LAST + 1,
+					 "pg_enc2icu_tbl incomplete");
+
+	icu_encoding_name = pg_enc2icu_tbl[encoding];
+
+	if (!icu_encoding_name)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("encoding \"%s\" not supported by ICU",
+						pg_encoding_to_char(encoding))));
+
+	return icu_encoding_name;
+}
+
+#endif /* not FRONTEND */
+
+
 /* ----------
  * Encoding checks, for error returns -1 else encoding id
  * ----------
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index e0c72fbb80..8dde1e8f9d 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -62,6 +62,7 @@
 #include "catalog/catalog.h"
 #include "catalog/pg_authid.h"
 #include "catalog/pg_class.h"
+#include "catalog/pg_collation.h"
 #include "common/file_utils.h"
 #include "common/restricted_token.h"
 #include "common/username.h"
@@ -1629,7 +1630,7 @@ setup_collation(FILE *cmdfd)
 	PG_CMD_PUTS("SELECT pg_import_system_collations(if_not_exists => false, schema => 'pg_catalog');\n\n");
 
 	/* Add an SQL-standard name */
-	PG_CMD_PRINTF2("INSERT INTO pg_collation (collname, collnamespace, collowner, collencoding, collcollate, collctype) VALUES ('ucs_basic', 'pg_catalog'::regnamespace, %u, %d, 'C', 'C');\n\n", BOOTSTRAP_SUPERUSERID, PG_UTF8);
+	PG_CMD_PRINTF3("INSERT INTO pg_collation (collname, collnamespace, collowner, collprovider, collencoding, collcollate, collctype) VALUES ('ucs_basic', 'pg_catalog'::regnamespace, %u, '%c', %d, 'C', 'C');\n\n", BOOTSTRAP_SUPERUSERID, COLLPROVIDER_LIBC, PG_UTF8);
 }
 
 /*
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index e67171dccb..54b4a1fd16 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -12806,8 +12806,10 @@ dumpCollation(Archive *fout, CollInfo *collinfo)
 	PQExpBuffer delq;
 	PQExpBuffer labelq;
 	PGresult   *res;
+	int			i_collprovider;
 	int			i_collcollate;
 	int			i_collctype;
+	const char *collprovider;
 	const char *collcollate;
 	const char *collctype;
 
@@ -12824,18 +12826,32 @@ dumpCollation(Archive *fout, CollInfo *collinfo)
 	selectSourceSchema(fout, collinfo->dobj.namespace->dobj.name);
 
 	/* Get collation-specific details */
-	appendPQExpBuffer(query, "SELECT "
-					  "collcollate, "
-					  "collctype "
-					  "FROM pg_catalog.pg_collation c "
-					  "WHERE c.oid = '%u'::pg_catalog.oid",
-					  collinfo->dobj.catId.oid);
+	if (fout->remoteVersion >= 100000)
+		appendPQExpBuffer(query, "SELECT "
+						  "collprovider, "
+						  "collcollate, "
+						  "collctype, "
+						  "collversion "
+						  "FROM pg_catalog.pg_collation c "
+						  "WHERE c.oid = '%u'::pg_catalog.oid",
+						  collinfo->dobj.catId.oid);
+	else
+		appendPQExpBuffer(query, "SELECT "
+						  "'c'::char AS collprovider, "
+						  "collcollate, "
+						  "collctype, "
+						  "NULL AS collversion "
+						  "FROM pg_catalog.pg_collation c "
+						  "WHERE c.oid = '%u'::pg_catalog.oid",
+						  collinfo->dobj.catId.oid);
 
 	res = ExecuteSqlQueryForSingleRow(fout, query->data);
 
+	i_collprovider = PQfnumber(res, "collprovider");
 	i_collcollate = PQfnumber(res, "collcollate");
 	i_collctype = PQfnumber(res, "collctype");
 
+	collprovider = PQgetvalue(res, 0, i_collprovider);
 	collcollate = PQgetvalue(res, 0, i_collcollate);
 	collctype = PQgetvalue(res, 0, i_collctype);
 
@@ -12847,11 +12863,50 @@ dumpCollation(Archive *fout, CollInfo *collinfo)
 	appendPQExpBuffer(delq, ".%s;\n",
 					  fmtId(collinfo->dobj.name));
 
-	appendPQExpBuffer(q, "CREATE COLLATION %s (lc_collate = ",
+	appendPQExpBuffer(q, "CREATE COLLATION %s (",
 					  fmtId(collinfo->dobj.name));
-	appendStringLiteralAH(q, collcollate, fout);
-	appendPQExpBufferStr(q, ", lc_ctype = ");
-	appendStringLiteralAH(q, collctype, fout);
+
+	appendPQExpBufferStr(q, "provider = ");
+	if (collprovider[0] == 'c')
+		appendPQExpBufferStr(q, "libc");
+	else if (collprovider[0] == 'i')
+		appendPQExpBufferStr(q, "icu");
+	else
+		exit_horribly(NULL,
+					  "unrecognized collation provider: %s\n",
+					  collprovider);
+
+	if (strcmp(collcollate, collctype) == 0)
+	{
+		appendPQExpBufferStr(q, ", locale = ");
+		appendStringLiteralAH(q, collcollate, fout);
+	}
+	else
+	{
+		appendPQExpBufferStr(q, ", lc_collate = ");
+		appendStringLiteralAH(q, collcollate, fout);
+		appendPQExpBufferStr(q, ", lc_ctype = ");
+		appendStringLiteralAH(q, collctype, fout);
+	}
+
+	/*
+	 * For binary upgrade, carry over the collation version.  For normal
+	 * dump/restore, omit the version, so that it is computed upon restore.
+	 */
+	if (dopt->binary_upgrade)
+	{
+		int			i_collversion;
+
+		i_collversion = PQfnumber(res, "collversion");
+		if (!PQgetisnull(res, 0, i_collversion))
+		{
+			appendPQExpBufferStr(q, ", version = ");
+			appendStringLiteralAH(q,
+								  PQgetvalue(res, 0, i_collversion),
+								  fout);
+		}
+	}
+
 	appendPQExpBufferStr(q, ");\n");
 
 	appendPQExpBuffer(labelq, "COLLATION %s", fmtId(collinfo->dobj.name));
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index a4e260a4e4..05c681ad7a 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -2424,7 +2424,7 @@
 		  'CREATE COLLATION test0 FROM "C";',
 		regexp =>
 		  qr/^
-		  \QCREATE COLLATION test0 (lc_collate = 'C', lc_ctype = 'C');\E/xm,
+		  \QCREATE COLLATION test0 (provider = libc, locale = 'C');\E/xm,
 	    collation => 1,
 		like => {
 			binary_upgrade           => 1,
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 61a3e2a848..8c583127fd 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3738,7 +3738,7 @@ listCollations(const char *pattern, bool verbose, bool showSystem)
 	PQExpBufferData buf;
 	PGresult   *res;
 	printQueryOpt myopt = pset.popt;
-	static const bool translate_columns[] = {false, false, false, false, false};
+	static const bool translate_columns[] = {false, false, false, false, false, false};
 
 	if (pset.sversion < 90100)
 	{
@@ -3762,6 +3762,11 @@ listCollations(const char *pattern, bool verbose, bool showSystem)
 					  gettext_noop("Collate"),
 					  gettext_noop("Ctype"));
 
+	if (pset.sversion >= 100000)
+		appendPQExpBuffer(&buf,
+						  ",\n       CASE c.collprovider WHEN 'd' THEN 'default' WHEN 'c' THEN 'libc' WHEN 'i' THEN 'icu' END AS \"%s\"",
+						  gettext_noop("Provider"));
+
 	if (verbose)
 		appendPQExpBuffer(&buf,
 						  ",\n       pg_catalog.obj_description(c.oid, 'pg_collation') AS \"%s\"",
diff --git a/src/include/catalog/pg_collation.h b/src/include/catalog/pg_collation.h
index 30c87e004e..8edd8aa066 100644
--- a/src/include/catalog/pg_collation.h
+++ b/src/include/catalog/pg_collation.h
@@ -34,9 +34,13 @@ CATALOG(pg_collation,3456)
 	NameData	collname;		/* collation name */
 	Oid			collnamespace;	/* OID of namespace containing collation */
 	Oid			collowner;		/* owner of collation */
+	char		collprovider;	/* see constants below */
 	int32		collencoding;	/* encoding for this collation; -1 = "all" */
 	NameData	collcollate;	/* LC_COLLATE setting */
 	NameData	collctype;		/* LC_CTYPE setting */
+#ifdef CATALOG_VARLEN			/* variable-length fields start here */
+	text		collversion;	/* provider-dependent version of collation data */
+#endif
 } FormData_pg_collation;
 
 /* ----------------
@@ -50,27 +54,34 @@ typedef FormData_pg_collation *Form_pg_collation;
  *		compiler constants for pg_collation
  * ----------------
  */
-#define Natts_pg_collation				6
+#define Natts_pg_collation				8
 #define Anum_pg_collation_collname		1
 #define Anum_pg_collation_collnamespace 2
 #define Anum_pg_collation_collowner		3
-#define Anum_pg_collation_collencoding	4
-#define Anum_pg_collation_collcollate	5
-#define Anum_pg_collation_collctype		6
+#define Anum_pg_collation_collprovider	4
+#define Anum_pg_collation_collencoding	5
+#define Anum_pg_collation_collcollate	6
+#define Anum_pg_collation_collctype		7
+#define Anum_pg_collation_collversion	8
 
 /* ----------------
  *		initial contents of pg_collation
  * ----------------
  */
 
-DATA(insert OID = 100 ( default		PGNSP PGUID -1 "" "" ));
+DATA(insert OID = 100 ( default		PGNSP PGUID d -1 "" "" _null_ ));
 DESCR("database's default collation");
 #define DEFAULT_COLLATION_OID	100
-DATA(insert OID = 950 ( C			PGNSP PGUID -1 "C" "C" ));
+DATA(insert OID = 950 ( C			PGNSP PGUID c -1 "C" "C" _null_ ));
 DESCR("standard C collation");
 #define C_COLLATION_OID			950
-DATA(insert OID = 951 ( POSIX		PGNSP PGUID -1 "POSIX" "POSIX" ));
+DATA(insert OID = 951 ( POSIX		PGNSP PGUID c -1 "POSIX" "POSIX" _null_ ));
 DESCR("standard POSIX collation");
 #define POSIX_COLLATION_OID		951
 
+
+#define COLLPROVIDER_DEFAULT	'd'
+#define COLLPROVIDER_ICU		'i'
+#define COLLPROVIDER_LIBC		'c'
+
 #endif   /* PG_COLLATION_H */
diff --git a/src/include/catalog/pg_collation_fn.h b/src/include/catalog/pg_collation_fn.h
index 482ba7920e..dfebdbaa0b 100644
--- a/src/include/catalog/pg_collation_fn.h
+++ b/src/include/catalog/pg_collation_fn.h
@@ -16,8 +16,10 @@
 
 extern Oid CollationCreate(const char *collname, Oid collnamespace,
 				Oid collowner,
+				char collprovider,
 				int32 collencoding,
 				const char *collcollate, const char *collctype,
+				const char *collversion,
 				bool if_not_exists);
 extern void RemoveCollationById(Oid collationOid);
 
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 836d6ff0b2..b92bfb9655 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -5398,6 +5398,9 @@ DESCR("pg_controldata init state information as a function");
 DATA(insert OID = 3445 ( pg_import_system_collations PGNSP PGUID 12 100 0 0 0 f f f f t f v r 2 0 2278 "16 4089" _null_ _null_ "{if_not_exists,schema}" _null_ _null_ pg_import_system_collations _null_ _null_ _null_ ));
 DESCR("import collations from operating system");
 
+DATA(insert OID = 3448 ( pg_collation_actual_version PGNSP PGUID 12 100 0 0 0 f f f f t f v s 1 0 25 "26" _null_ _null_ _null_ _null_ _null_ pg_collation_actual_version _null_ _null_ _null_ ));
+DESCR("import collations from operating system");
+
 /* system management/monitoring related functions */
 DATA(insert OID = 3353 (  pg_ls_logdir               PGNSP PGUID 12 10 20 0 0 f f f f t t v s 0 0 2249 "" "{25,20,1184}" "{o,o,o}" "{name,size,modification}" _null_ _null_ pg_ls_logdir _null_ _null_ _null_ ));
 DESCR("list files in the log directory");
diff --git a/src/include/commands/collationcmds.h b/src/include/commands/collationcmds.h
index 3b2fcb8271..df5623ccb6 100644
--- a/src/include/commands/collationcmds.h
+++ b/src/include/commands/collationcmds.h
@@ -20,5 +20,6 @@
 
 extern ObjectAddress DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_exists);
 extern void IsThereCollationInNamespace(const char *collname, Oid nspOid);
+extern ObjectAddress AlterCollation(AlterCollationStmt *stmt);
 
 #endif   /* COLLATIONCMDS_H */
diff --git a/src/include/mb/pg_wchar.h b/src/include/mb/pg_wchar.h
index 5f54697302..9c5e749c9e 100644
--- a/src/include/mb/pg_wchar.h
+++ b/src/include/mb/pg_wchar.h
@@ -333,6 +333,12 @@ typedef struct pg_enc2gettext
 extern const pg_enc2gettext pg_enc2gettext_tbl[];
 
 /*
+ * Encoding names for ICU
+ */
+extern bool is_encoding_supported_by_icu(int encoding);
+extern const char *get_encoding_name_for_icu(int encoding);
+
+/*
  * pg_wchar stuff
  */
 typedef int (*mb2wchar_with_len_converter) (const unsigned char *from,
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 2bc7a5df11..5d6d05385c 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -423,6 +423,7 @@ typedef enum NodeTag
 	T_CreateSubscriptionStmt,
 	T_AlterSubscriptionStmt,
 	T_DropSubscriptionStmt,
+	T_AlterCollationStmt,
 
 	/*
 	 * TAGS FOR PARSE TREE NODES (parsenodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index d576523f6a..abc4ae5487 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1733,6 +1733,17 @@ typedef struct AlterTableCmd	/* one subcommand of an ALTER TABLE */
 
 
 /* ----------------------
+ * Alter Collation
+ * ----------------------
+ */
+typedef struct AlterCollationStmt
+{
+	NodeTag		type;
+	List	   *collname;
+} AlterCollationStmt;
+
+
+/* ----------------------
  *	Alter Domain
  *
  * The fields are used in different ways by the different variants of
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 5bcd8a1160..0ca708bbdf 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -603,6 +603,9 @@
 /* Define to 1 if you have the external array `tzname'. */
 #undef HAVE_TZNAME
 
+/* Define to 1 if you have the `ucol_strcollUTF8' function. */
+#undef HAVE_UCOL_STRCOLLUTF8
+
 /* Define to 1 if you have the <ucred.h> header file. */
 #undef HAVE_UCRED_H
 
@@ -816,6 +819,9 @@
    (--enable-float8-byval) */
 #undef USE_FLOAT8_BYVAL
 
+/* Define to build with ICU support. (--with-icu) */
+#undef USE_ICU
+
 /* Define to 1 to build with LDAP support. (--with-ldap) */
 #undef USE_LDAP
 
diff --git a/src/include/utils/pg_locale.h b/src/include/utils/pg_locale.h
index cb509e2b6b..12d7547413 100644
--- a/src/include/utils/pg_locale.h
+++ b/src/include/utils/pg_locale.h
@@ -15,6 +15,9 @@
 #if defined(LOCALE_T_IN_XLOCALE) || defined(WCSTOMBS_L_IN_XLOCALE)
 #include <xlocale.h>
 #endif
+#ifdef USE_ICU
+#include <unicode/ucol.h>
+#endif
 
 #include "utils/guc.h"
 
@@ -61,17 +64,36 @@ extern void cache_locale_time(void);
  * We define our own wrapper around locale_t so we can keep the same
  * function signatures for all builds, while not having to create a
  * fake version of the standard type locale_t in the global namespace.
- * The fake version of pg_locale_t can be checked for truth; that's
- * about all it will be needed for.
+ * pg_locale_t is occasionally checked for truth, so make it a pointer.
  */
+struct pg_locale_t
+{
+	char	provider;
+	union
+	{
 #ifdef HAVE_LOCALE_T
-typedef locale_t pg_locale_t;
-#else
-typedef int pg_locale_t;
+		locale_t lt;
+#endif
+#ifdef USE_ICU
+		struct {
+			const char *locale;
+			UCollator *ucol;
+		} icu;
 #endif
+	} info;
+};
+
+typedef struct pg_locale_t *pg_locale_t;
 
 extern pg_locale_t pg_newlocale_from_collation(Oid collid);
 
+extern char *get_collation_actual_version(char collprovider, const char *collcollate);
+
+#ifdef USE_ICU
+extern int32_t icu_to_uchar(UChar **buff_uchar, const char *buff, size_t nbytes);
+extern int32_t icu_from_uchar(char **result, UChar *buff_uchar, int32_t len_uchar);
+#endif
+
 /* These functions convert from/to libc's wchar_t, *not* pg_wchar_t */
 #ifdef USE_WIDE_UPPER_LOWER
 extern size_t wchar2char(char *to, const wchar_t *from, size_t tolen,
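
A minimal sketch of how a string comparison might dispatch on the
pg_locale_t defined above, assuming icu_to_uchar() palloc's its output
as in the patch; the function name and the simplified libc fallback are
hypothetical, not part of the patch:

/* Hypothetical sketch, not part of the patch. */
#include "postgres.h"
#include "catalog/pg_collation.h"
#include "utils/pg_locale.h"

static int
compare_with_locale(const char *a, const char *b, pg_locale_t loc)
{
#ifdef USE_ICU
    if (loc && loc->provider == COLLPROVIDER_ICU)
    {
        UChar      *ua;
        UChar      *ub;
        int32_t     ualen;
        int32_t     ublen;
        UCollationResult r;

        /* icu_to_uchar() is declared in the pg_locale.h hunk above */
        ualen = icu_to_uchar(&ua, a, strlen(a));
        ublen = icu_to_uchar(&ub, b, strlen(b));
        r = ucol_strcoll(loc->info.icu.ucol, ua, ualen, ub, ublen);
        pfree(ua);
        pfree(ub);
        return (r == UCOL_EQUAL) ? 0 : (r == UCOL_LESS ? -1 : 1);
    }
#endif
    /* simplified libc fallback; real code would use strcoll_l() with loc->info.lt */
    return strcoll(a, b);
}
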
diff --git a/src/test/regress/GNUmakefile b/src/test/regress/GNUmakefile
index b923ea1420..a747facb9a 100644
--- a/src/test/regress/GNUmakefile
+++ b/src/test/regress/GNUmakefile
@@ -125,6 +125,9 @@ tablespace-setup:
 ##
 
 REGRESS_OPTS = --dlpath=. $(EXTRA_REGRESS_OPTS)
+ifeq ($(with_icu),yes)
+override EXTRA_TESTS := collate.icu $(EXTRA_TESTS)
+endif
 
 check: all tablespace-setup
 	$(pg_regress_check) $(REGRESS_OPTS) --schedule=$(srcdir)/parallel_schedule $(MAXCONNOPT) $(EXTRA_TESTS)
diff --git a/src/test/regress/expected/collate.linux.utf8.out b/src/test/regress/expected/collate.icu.out
similarity index 80%
copy from src/test/regress/expected/collate.linux.utf8.out
copy to src/test/regress/expected/collate.icu.out
index 293e78641e..e1fc9984f2 100644
--- a/src/test/regress/expected/collate.linux.utf8.out
+++ b/src/test/regress/expected/collate.icu.out
@@ -1,54 +1,54 @@
 /*
- * This test is for Linux/glibc systems and assumes that a full set of
- * locales is installed.  It must be run in a database with UTF-8 encoding,
- * because other encodings don't support all the characters used.
+ * This test is for ICU collations.
  */
 SET client_encoding TO UTF8;
+CREATE SCHEMA collate_tests;
+SET search_path = collate_tests;
 CREATE TABLE collate_test1 (
     a int,
-    b text COLLATE "en_US" NOT NULL
+    b text COLLATE "en-x-icu" NOT NULL
 );
 \d collate_test1
-           Table "public.collate_test1"
+        Table "collate_tests.collate_test1"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  a      | integer |           |          | 
- b      | text    | en_US     | not null | 
+ b      | text    | en-x-icu  | not null | 
 
 CREATE TABLE collate_test_fail (
     a int,
-    b text COLLATE "ja_JP.eucjp"
+    b text COLLATE "ja_JP.eucjp-x-icu"
 );
-ERROR:  collation "ja_JP.eucjp" for encoding "UTF8" does not exist
-LINE 3:     b text COLLATE "ja_JP.eucjp"
+ERROR:  collation "ja_JP.eucjp-x-icu" for encoding "UTF8" does not exist
+LINE 3:     b text COLLATE "ja_JP.eucjp-x-icu"
                    ^
 CREATE TABLE collate_test_fail (
     a int,
-    b text COLLATE "foo"
+    b text COLLATE "foo-x-icu"
 );
-ERROR:  collation "foo" for encoding "UTF8" does not exist
-LINE 3:     b text COLLATE "foo"
+ERROR:  collation "foo-x-icu" for encoding "UTF8" does not exist
+LINE 3:     b text COLLATE "foo-x-icu"
                    ^
 CREATE TABLE collate_test_fail (
-    a int COLLATE "en_US",
+    a int COLLATE "en-x-icu",
     b text
 );
 ERROR:  collations are not supported by type integer
-LINE 2:     a int COLLATE "en_US",
+LINE 2:     a int COLLATE "en-x-icu",
                   ^
 CREATE TABLE collate_test_like (
     LIKE collate_test1
 );
 \d collate_test_like
-         Table "public.collate_test_like"
+      Table "collate_tests.collate_test_like"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  a      | integer |           |          | 
- b      | text    | en_US     | not null | 
+ b      | text    | en-x-icu  | not null | 
 
 CREATE TABLE collate_test2 (
     a int,
-    b text COLLATE "sv_SE"
+    b text COLLATE "sv-x-icu"
 );
 CREATE TABLE collate_test3 (
     a int,
@@ -106,12 +106,12 @@ SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "C";
  3 | bbc
 (2 rows)
 
-SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "en_US";
-ERROR:  collation mismatch between explicit collations "C" and "en_US"
+SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "en-x-icu";
+ERROR:  collation mismatch between explicit collations "C" and "en-x-icu"
 LINE 1: ...* FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "e...
                                                              ^
-CREATE DOMAIN testdomain_sv AS text COLLATE "sv_SE";
-CREATE DOMAIN testdomain_i AS int COLLATE "sv_SE"; -- fails
+CREATE DOMAIN testdomain_sv AS text COLLATE "sv-x-icu";
+CREATE DOMAIN testdomain_i AS int COLLATE "sv-x-icu"; -- fails
 ERROR:  collations are not supported by type integer
 CREATE TABLE collate_test4 (
     a int,
@@ -129,7 +129,7 @@ SELECT a, b FROM collate_test4 ORDER BY b;
 
 CREATE TABLE collate_test5 (
     a int,
-    b testdomain_sv COLLATE "en_US"
+    b testdomain_sv COLLATE "en-x-icu"
 );
 INSERT INTO collate_test5 SELECT * FROM collate_test1;
 SELECT a, b FROM collate_test5 ORDER BY b;
@@ -206,13 +206,13 @@ SELECT * FROM collate_test3 ORDER BY b;
 (4 rows)
 
 -- constant expression folding
-SELECT 'bbc' COLLATE "en_US" > 'äbc' COLLATE "en_US" AS "true";
+SELECT 'bbc' COLLATE "en-x-icu" > 'äbc' COLLATE "en-x-icu" AS "true";
  true 
 ------
  t
 (1 row)
 
-SELECT 'bbc' COLLATE "sv_SE" > 'äbc' COLLATE "sv_SE" AS "false";
+SELECT 'bbc' COLLATE "sv-x-icu" > 'äbc' COLLATE "sv-x-icu" AS "false";
  false 
 -------
  f
@@ -221,8 +221,8 @@ SELECT 'bbc' COLLATE "sv_SE" > 'äbc' COLLATE "sv_SE" AS "false";
 -- upper/lower
 CREATE TABLE collate_test10 (
     a int,
-    x text COLLATE "en_US",
-    y text COLLATE "tr_TR"
+    x text COLLATE "en-x-icu",
+    y text COLLATE "tr-x-icu"
 );
 INSERT INTO collate_test10 VALUES (1, 'hij', 'hij'), (2, 'HIJ', 'HIJ');
 SELECT a, lower(x), lower(y), upper(x), upper(y), initcap(x), initcap(y) FROM collate_test10;
@@ -290,25 +290,25 @@ SELECT * FROM collate_test1 WHERE b ILIKE '%bc%';
  4 | ABC
 (4 rows)
 
-SELECT 'Türkiye' COLLATE "en_US" ILIKE '%KI%' AS "true";
+SELECT 'Türkiye' COLLATE "en-x-icu" ILIKE '%KI%' AS "true";
  true 
 ------
  t
 (1 row)
 
-SELECT 'Türkiye' COLLATE "tr_TR" ILIKE '%KI%' AS "false";
+SELECT 'Türkiye' COLLATE "tr-x-icu" ILIKE '%KI%' AS "false";
  false 
 -------
  f
 (1 row)
 
-SELECT 'bıt' ILIKE 'BIT' COLLATE "en_US" AS "false";
+SELECT 'bıt' ILIKE 'BIT' COLLATE "en-x-icu" AS "false";
  false 
 -------
  f
 (1 row)
 
-SELECT 'bıt' ILIKE 'BIT' COLLATE "tr_TR" AS "true";
+SELECT 'bıt' ILIKE 'BIT' COLLATE "tr-x-icu" AS "true";
  true 
 ------
  t
@@ -364,50 +364,75 @@ SELECT * FROM collate_test1 WHERE b ~* 'bc';
  4 | ABC
 (4 rows)
 
-SELECT 'Türkiye' COLLATE "en_US" ~* 'KI' AS "true";
+CREATE TABLE collate_test6 (
+    a int,
+    b text COLLATE "en-x-icu"
+);
+INSERT INTO collate_test6 VALUES (1, 'abc'), (2, 'ABC'), (3, '123'), (4, 'ab1'),
+                                 (5, 'a1!'), (6, 'a c'), (7, '!.;'), (8, '   '),
+                                 (9, 'äbç'), (10, 'ÄBÇ');
+SELECT b,
+       b ~ '^[[:alpha:]]+$' AS is_alpha,
+       b ~ '^[[:upper:]]+$' AS is_upper,
+       b ~ '^[[:lower:]]+$' AS is_lower,
+       b ~ '^[[:digit:]]+$' AS is_digit,
+       b ~ '^[[:alnum:]]+$' AS is_alnum,
+       b ~ '^[[:graph:]]+$' AS is_graph,
+       b ~ '^[[:print:]]+$' AS is_print,
+       b ~ '^[[:punct:]]+$' AS is_punct,
+       b ~ '^[[:space:]]+$' AS is_space
+FROM collate_test6;
+  b  | is_alpha | is_upper | is_lower | is_digit | is_alnum | is_graph | is_print | is_punct | is_space 
+-----+----------+----------+----------+----------+----------+----------+----------+----------+----------
+ abc | t        | f        | t        | f        | t        | t        | t        | f        | f
+ ABC | t        | t        | f        | f        | t        | t        | t        | f        | f
+ 123 | f        | f        | f        | t        | t        | t        | t        | f        | f
+ ab1 | f        | f        | f        | f        | t        | t        | t        | f        | f
+ a1! | f        | f        | f        | f        | f        | t        | t        | f        | f
+ a c | f        | f        | f        | f        | f        | f        | t        | f        | f
+ !.; | f        | f        | f        | f        | f        | t        | t        | t        | f
+     | f        | f        | f        | f        | f        | f        | t        | f        | t
+ äbç | t        | f        | t        | f        | t        | t        | t        | f        | f
+ ÄBÇ | t        | t        | f        | f        | t        | t        | t        | f        | f
+(10 rows)
+
+SELECT 'Türkiye' COLLATE "en-x-icu" ~* 'KI' AS "true";
  true 
 ------
  t
 (1 row)
 
-SELECT 'Türkiye' COLLATE "tr_TR" ~* 'KI' AS "false";
+SELECT 'Türkiye' COLLATE "tr-x-icu" ~* 'KI' AS "true";  -- true with ICU
+ true 
+------
+ t
+(1 row)
+
+SELECT 'bıt' ~* 'BIT' COLLATE "en-x-icu" AS "false";
  false 
 -------
  f
 (1 row)
 
-SELECT 'bıt' ~* 'BIT' COLLATE "en_US" AS "false";
+SELECT 'bıt' ~* 'BIT' COLLATE "tr-x-icu" AS "false";  -- false with ICU
  false 
 -------
  f
 (1 row)
 
-SELECT 'bıt' ~* 'BIT' COLLATE "tr_TR" AS "true";
- true 
-------
- t
-(1 row)
-
 -- The following actually exercises the selectivity estimation for ~*.
 SELECT relname FROM pg_class WHERE relname ~* '^abc';
  relname 
 ---------
 (0 rows)
 
+/* not run by default because it requires tr_TR system locale
 -- to_char
+
 SET lc_time TO 'tr_TR';
 SELECT to_char(date '2010-04-01', 'DD TMMON YYYY');
-   to_char   
--------------
- 01 NIS 2010
-(1 row)
-
-SELECT to_char(date '2010-04-01', 'DD TMMON YYYY' COLLATE "tr_TR");
-   to_char   
--------------
- 01 NİS 2010
-(1 row)
-
+SELECT to_char(date '2010-04-01', 'DD TMMON YYYY' COLLATE "tr-x-icu");
+*/
 -- backwards parsing
 CREATE VIEW collview1 AS SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc';
 CREATE VIEW collview2 AS SELECT a, b FROM collate_test1 ORDER BY b COLLATE "C";
@@ -693,7 +718,7 @@ SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM collate_test3; -- ok
 (8 rows)
 
 SELECT a, b FROM collate_test1 UNION SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
-ERROR:  collation mismatch between implicit collations "en_US" and "C"
+ERROR:  collation mismatch between implicit collations "en-x-icu" and "C"
 LINE 1: SELECT a, b FROM collate_test1 UNION SELECT a, b FROM collat...
                                                        ^
 HINT:  You can choose the collation by applying the COLLATE clause to one or both expressions.
@@ -707,12 +732,12 @@ SELECT a, b COLLATE "C" FROM collate_test1 UNION SELECT a, b FROM collate_test3
 (4 rows)
 
 SELECT a, b FROM collate_test1 INTERSECT SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
-ERROR:  collation mismatch between implicit collations "en_US" and "C"
+ERROR:  collation mismatch between implicit collations "en-x-icu" and "C"
 LINE 1: ...ELECT a, b FROM collate_test1 INTERSECT SELECT a, b FROM col...
                                                              ^
 HINT:  You can choose the collation by applying the COLLATE clause to one or both expressions.
 SELECT a, b FROM collate_test1 EXCEPT SELECT a, b FROM collate_test3 ORDER BY 2; -- fail
-ERROR:  collation mismatch between implicit collations "en_US" and "C"
+ERROR:  collation mismatch between implicit collations "en-x-icu" and "C"
 LINE 1: SELECT a, b FROM collate_test1 EXCEPT SELECT a, b FROM colla...
                                                         ^
 HINT:  You can choose the collation by applying the COLLATE clause to one or both expressions.
@@ -731,18 +756,18 @@ select x || y from collate_test10; -- ok, because || is not collation aware
 (2 rows)
 
 select x, y from collate_test10 order by x || y; -- not so ok
-ERROR:  collation mismatch between implicit collations "en_US" and "tr_TR"
+ERROR:  collation mismatch between implicit collations "en-x-icu" and "tr-x-icu"
 LINE 1: select x, y from collate_test10 order by x || y;
                                                       ^
 HINT:  You can choose the collation by applying the COLLATE clause to one or both expressions.
 -- collation mismatch between recursive and non-recursive term
 WITH RECURSIVE foo(x) AS
-   (SELECT x FROM (VALUES('a' COLLATE "en_US"),('b')) t(x)
+   (SELECT x FROM (VALUES('a' COLLATE "en-x-icu"),('b')) t(x)
    UNION ALL
-   SELECT (x || 'c') COLLATE "de_DE" FROM foo WHERE length(x) < 10)
+   SELECT (x || 'c') COLLATE "de-x-icu" FROM foo WHERE length(x) < 10)
 SELECT * FROM foo;
-ERROR:  recursive query "foo" column 1 has collation "en_US" in non-recursive term but collation "de_DE" overall
-LINE 2:    (SELECT x FROM (VALUES('a' COLLATE "en_US"),('b')) t(x)
+ERROR:  recursive query "foo" column 1 has collation "en-x-icu" in non-recursive term but collation "de-x-icu" overall
+LINE 2:    (SELECT x FROM (VALUES('a' COLLATE "en-x-icu"),('b')) t(x...
                    ^
 HINT:  Use the COLLATE clause to set the collation of the non-recursive term.
 -- casting
@@ -843,7 +868,7 @@ begin
   return xx < yy;
 end
 $$;
-SELECT mylt2('a', 'B' collate "en_US") as t, mylt2('a', 'B' collate "C") as f;
+SELECT mylt2('a', 'B' collate "en-x-icu") as t, mylt2('a', 'B' collate "C") as f;
  t | f 
 ---+---
  t | f
@@ -957,29 +982,23 @@ CREATE SCHEMA test_schema;
 -- We need to do this this way to cope with varying names for encodings:
 do $$
 BEGIN
-  EXECUTE 'CREATE COLLATION test0 (locale = ' ||
+  EXECUTE 'CREATE COLLATION test0 (provider = icu, locale = ' ||
           quote_literal(current_setting('lc_collate')) || ');';
 END
 $$;
 CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
-ERROR:  collation "test0" for encoding "UTF8" already exists
-CREATE COLLATION IF NOT EXISTS test0 FROM "C"; -- ok, skipped
-NOTICE:  collation "test0" for encoding "UTF8" already exists, skipping
-CREATE COLLATION IF NOT EXISTS test0 (locale = 'foo'); -- ok, skipped
-NOTICE:  collation "test0" for encoding "UTF8" already exists, skipping
+ERROR:  collation "test0" already exists
 do $$
 BEGIN
-  EXECUTE 'CREATE COLLATION test1 (lc_collate = ' ||
+  EXECUTE 'CREATE COLLATION test1 (provider = icu, lc_collate = ' ||
           quote_literal(current_setting('lc_collate')) ||
           ', lc_ctype = ' ||
           quote_literal(current_setting('lc_ctype')) || ');';
 END
 $$;
-CREATE COLLATION test3 (lc_collate = 'en_US.utf8'); -- fail, need lc_ctype
+CREATE COLLATION test3 (provider = icu, lc_collate = 'en_US.utf8'); -- fail, need lc_ctype
 ERROR:  parameter "lc_ctype" must be specified
-CREATE COLLATION testx (locale = 'nonsense'); -- fail
-ERROR:  could not create locale "nonsense": No such file or directory
-DETAIL:  The operating system could not find any locale data for the locale name "nonsense".
+CREATE COLLATION testx (provider = icu, locale = 'nonsense'); /* never fails with ICU */  DROP COLLATION testx;
 CREATE COLLATION test4 FROM nonsense;
 ERROR:  collation "nonsense" for encoding "UTF8" does not exist
 CREATE COLLATION test5 FROM test0;
@@ -993,7 +1012,7 @@ SELECT collname FROM pg_collation WHERE collname LIKE 'test%' ORDER BY 1;
 
 ALTER COLLATION test1 RENAME TO test11;
 ALTER COLLATION test0 RENAME TO test11; -- fail
-ERROR:  collation "test11" for encoding "UTF8" already exists in schema "public"
+ERROR:  collation "test11" already exists in schema "collate_tests"
 ALTER COLLATION test1 RENAME TO test22; -- fail
 ERROR:  collation "test1" for encoding "UTF8" does not exist
 ALTER COLLATION test11 OWNER TO regress_test_role;
@@ -1005,11 +1024,11 @@ SELECT collname, nspname, obj_description(pg_collation.oid, 'pg_collation')
     FROM pg_collation JOIN pg_namespace ON (collnamespace = pg_namespace.oid)
     WHERE collname LIKE 'test%'
     ORDER BY 1;
- collname |   nspname   | obj_description 
-----------+-------------+-----------------
- test0    | public      | US English
- test11   | test_schema | 
- test5    | public      | 
+ collname |    nspname    | obj_description 
+----------+---------------+-----------------
+ test0    | collate_tests | US English
+ test11   | test_schema   | 
+ test5    | collate_tests | 
 (3 rows)
 
 DROP COLLATION test0, test_schema.test11, test5;
@@ -1024,6 +1043,9 @@ SELECT collname FROM pg_collation WHERE collname LIKE 'test%';
 
 DROP SCHEMA test_schema;
 DROP ROLE regress_test_role;
+-- ALTER
+ALTER COLLATION "en-x-icu" REFRESH VERSION;
+NOTICE:  version has not changed
 -- dependencies
 CREATE COLLATION test0 FROM "C";
 CREATE TABLE collate_dep_test1 (a int, b text COLLATE test0);
@@ -1048,13 +1070,13 @@ drop cascades to composite type collate_dep_test2 column y
 drop cascades to view collate_dep_test3
 drop cascades to index collate_dep_test4i
 \d collate_dep_test1
-         Table "public.collate_dep_test1"
+      Table "collate_tests.collate_dep_test1"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  a      | integer |           |          | 
 
 \d collate_dep_test2
-     Composite type "public.collate_dep_test2"
+ Composite type "collate_tests.collate_dep_test2"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  x      | integer |           |          | 
@@ -1063,7 +1085,7 @@ DROP TABLE collate_dep_test1, collate_dep_test4t;
 DROP TYPE collate_dep_test2;
 -- test range types and collations
 create type textrange_c as range(subtype=text, collation="C");
-create type textrange_en_us as range(subtype=text, collation="en_US");
+create type textrange_en_us as range(subtype=text, collation="en-x-icu");
 select textrange_c('A','Z') @> 'b'::text;
  ?column? 
 ----------
@@ -1078,3 +1100,27 @@ select textrange_en_us('A','Z') @> 'b'::text;
 
 drop type textrange_c;
 drop type textrange_en_us;
+-- cleanup
+DROP SCHEMA collate_tests CASCADE;
+NOTICE:  drop cascades to 18 other objects
+DETAIL:  drop cascades to table collate_test1
+drop cascades to table collate_test_like
+drop cascades to table collate_test2
+drop cascades to table collate_test3
+drop cascades to type testdomain_sv
+drop cascades to table collate_test4
+drop cascades to table collate_test5
+drop cascades to table collate_test10
+drop cascades to table collate_test6
+drop cascades to view collview1
+drop cascades to view collview2
+drop cascades to view collview3
+drop cascades to type testdomain
+drop cascades to function mylt(text,text)
+drop cascades to function mylt_noninline(text,text)
+drop cascades to function mylt_plpgsql(text,text)
+drop cascades to function mylt2(text,text)
+drop cascades to function dup(anyelement)
+RESET search_path;
+-- leave a collation for pg_upgrade test
+CREATE COLLATION coll_icu_upgrade FROM "und-x-icu";
diff --git a/src/test/regress/expected/collate.linux.utf8.out b/src/test/regress/expected/collate.linux.utf8.out
index 293e78641e..26275c3fb3 100644
--- a/src/test/regress/expected/collate.linux.utf8.out
+++ b/src/test/regress/expected/collate.linux.utf8.out
@@ -4,12 +4,14 @@
  * because other encodings don't support all the characters used.
  */
 SET client_encoding TO UTF8;
+CREATE SCHEMA collate_tests;
+SET search_path = collate_tests;
 CREATE TABLE collate_test1 (
     a int,
     b text COLLATE "en_US" NOT NULL
 );
 \d collate_test1
-           Table "public.collate_test1"
+        Table "collate_tests.collate_test1"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  a      | integer |           |          | 
@@ -40,7 +42,7 @@ CREATE TABLE collate_test_like (
     LIKE collate_test1
 );
 \d collate_test_like
-         Table "public.collate_test_like"
+      Table "collate_tests.collate_test_like"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  a      | integer |           |          | 
@@ -364,6 +366,38 @@ SELECT * FROM collate_test1 WHERE b ~* 'bc';
  4 | ABC
 (4 rows)
 
+CREATE TABLE collate_test6 (
+    a int,
+    b text COLLATE "en_US"
+);
+INSERT INTO collate_test6 VALUES (1, 'abc'), (2, 'ABC'), (3, '123'), (4, 'ab1'),
+                                 (5, 'a1!'), (6, 'a c'), (7, '!.;'), (8, '   '),
+                                 (9, 'äbç'), (10, 'ÄBÇ');
+SELECT b,
+       b ~ '^[[:alpha:]]+$' AS is_alpha,
+       b ~ '^[[:upper:]]+$' AS is_upper,
+       b ~ '^[[:lower:]]+$' AS is_lower,
+       b ~ '^[[:digit:]]+$' AS is_digit,
+       b ~ '^[[:alnum:]]+$' AS is_alnum,
+       b ~ '^[[:graph:]]+$' AS is_graph,
+       b ~ '^[[:print:]]+$' AS is_print,
+       b ~ '^[[:punct:]]+$' AS is_punct,
+       b ~ '^[[:space:]]+$' AS is_space
+FROM collate_test6;
+  b  | is_alpha | is_upper | is_lower | is_digit | is_alnum | is_graph | is_print | is_punct | is_space 
+-----+----------+----------+----------+----------+----------+----------+----------+----------+----------
+ abc | t        | f        | t        | f        | t        | t        | t        | f        | f
+ ABC | t        | t        | f        | f        | t        | t        | t        | f        | f
+ 123 | f        | f        | f        | t        | t        | t        | t        | f        | f
+ ab1 | f        | f        | f        | f        | t        | t        | t        | f        | f
+ a1! | f        | f        | f        | f        | f        | t        | t        | f        | f
+ a c | f        | f        | f        | f        | f        | f        | t        | f        | f
+ !.; | f        | f        | f        | f        | f        | t        | t        | t        | f
+     | f        | f        | f        | f        | f        | f        | t        | f        | t
+ äbç | t        | f        | t        | f        | t        | t        | t        | f        | f
+ ÄBÇ | t        | t        | f        | f        | t        | t        | t        | f        | f
+(10 rows)
+
 SELECT 'Türkiye' COLLATE "en_US" ~* 'KI' AS "true";
  true 
 ------
@@ -980,6 +1014,8 @@ ERROR:  parameter "lc_ctype" must be specified
 CREATE COLLATION testx (locale = 'nonsense'); -- fail
 ERROR:  could not create locale "nonsense": No such file or directory
 DETAIL:  The operating system could not find any locale data for the locale name "nonsense".
+CREATE COLLATION testy (locale = 'en_US.utf8', version = 'foo'); -- fail, no versions for libc
+ERROR:  collation "testy" has no actual version, but a version was specified
 CREATE COLLATION test4 FROM nonsense;
 ERROR:  collation "nonsense" for encoding "UTF8" does not exist
 CREATE COLLATION test5 FROM test0;
@@ -993,7 +1029,7 @@ SELECT collname FROM pg_collation WHERE collname LIKE 'test%' ORDER BY 1;
 
 ALTER COLLATION test1 RENAME TO test11;
 ALTER COLLATION test0 RENAME TO test11; -- fail
-ERROR:  collation "test11" for encoding "UTF8" already exists in schema "public"
+ERROR:  collation "test11" for encoding "UTF8" already exists in schema "collate_tests"
 ALTER COLLATION test1 RENAME TO test22; -- fail
 ERROR:  collation "test1" for encoding "UTF8" does not exist
 ALTER COLLATION test11 OWNER TO regress_test_role;
@@ -1005,11 +1041,11 @@ SELECT collname, nspname, obj_description(pg_collation.oid, 'pg_collation')
     FROM pg_collation JOIN pg_namespace ON (collnamespace = pg_namespace.oid)
     WHERE collname LIKE 'test%'
     ORDER BY 1;
- collname |   nspname   | obj_description 
-----------+-------------+-----------------
- test0    | public      | US English
- test11   | test_schema | 
- test5    | public      | 
+ collname |    nspname    | obj_description 
+----------+---------------+-----------------
+ test0    | collate_tests | US English
+ test11   | test_schema   | 
+ test5    | collate_tests | 
 (3 rows)
 
 DROP COLLATION test0, test_schema.test11, test5;
@@ -1024,6 +1060,9 @@ SELECT collname FROM pg_collation WHERE collname LIKE 'test%';
 
 DROP SCHEMA test_schema;
 DROP ROLE regress_test_role;
+-- ALTER
+ALTER COLLATION "en_US" REFRESH VERSION;
+NOTICE:  version has not changed
 -- dependencies
 CREATE COLLATION test0 FROM "C";
 CREATE TABLE collate_dep_test1 (a int, b text COLLATE test0);
@@ -1048,13 +1087,13 @@ drop cascades to composite type collate_dep_test2 column y
 drop cascades to view collate_dep_test3
 drop cascades to index collate_dep_test4i
 \d collate_dep_test1
-         Table "public.collate_dep_test1"
+      Table "collate_tests.collate_dep_test1"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  a      | integer |           |          | 
 
 \d collate_dep_test2
-     Composite type "public.collate_dep_test2"
+ Composite type "collate_tests.collate_dep_test2"
  Column |  Type   | Collation | Nullable | Default 
 --------+---------+-----------+----------+---------
  x      | integer |           |          | 
@@ -1078,3 +1117,24 @@ select textrange_en_us('A','Z') @> 'b'::text;
 
 drop type textrange_c;
 drop type textrange_en_us;
+-- cleanup
+DROP SCHEMA collate_tests CASCADE;
+NOTICE:  drop cascades to 18 other objects
+DETAIL:  drop cascades to table collate_test1
+drop cascades to table collate_test_like
+drop cascades to table collate_test2
+drop cascades to table collate_test3
+drop cascades to type testdomain_sv
+drop cascades to table collate_test4
+drop cascades to table collate_test5
+drop cascades to table collate_test10
+drop cascades to table collate_test6
+drop cascades to view collview1
+drop cascades to view collview2
+drop cascades to view collview3
+drop cascades to type testdomain
+drop cascades to function mylt(text,text)
+drop cascades to function mylt_noninline(text,text)
+drop cascades to function mylt_plpgsql(text,text)
+drop cascades to function mylt2(text,text)
+drop cascades to function dup(anyelement)
diff --git a/src/test/regress/sql/collate.linux.utf8.sql b/src/test/regress/sql/collate.icu.sql
similarity index 80%
copy from src/test/regress/sql/collate.linux.utf8.sql
copy to src/test/regress/sql/collate.icu.sql
index c349cbde2b..ef39445b30 100644
--- a/src/test/regress/sql/collate.linux.utf8.sql
+++ b/src/test/regress/sql/collate.icu.sql
@@ -1,31 +1,32 @@
 /*
- * This test is for Linux/glibc systems and assumes that a full set of
- * locales is installed.  It must be run in a database with UTF-8 encoding,
- * because other encodings don't support all the characters used.
+ * This test is for ICU collations.
  */
 
 SET client_encoding TO UTF8;
 
+CREATE SCHEMA collate_tests;
+SET search_path = collate_tests;
+
 
 CREATE TABLE collate_test1 (
     a int,
-    b text COLLATE "en_US" NOT NULL
+    b text COLLATE "en-x-icu" NOT NULL
 );
 
 \d collate_test1
 
 CREATE TABLE collate_test_fail (
     a int,
-    b text COLLATE "ja_JP.eucjp"
+    b text COLLATE "ja_JP.eucjp-x-icu"
 );
 
 CREATE TABLE collate_test_fail (
     a int,
-    b text COLLATE "foo"
+    b text COLLATE "foo-x-icu"
 );
 
 CREATE TABLE collate_test_fail (
-    a int COLLATE "en_US",
+    a int COLLATE "en-x-icu",
     b text
 );
 
@@ -37,7 +38,7 @@ CREATE TABLE collate_test_like (
 
 CREATE TABLE collate_test2 (
     a int,
-    b text COLLATE "sv_SE"
+    b text COLLATE "sv-x-icu"
 );
 
 CREATE TABLE collate_test3 (
@@ -57,11 +58,11 @@ CREATE TABLE collate_test3 (
 SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc';
 SELECT * FROM collate_test1 WHERE b >= 'bbc' COLLATE "C";
 SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "C";
-SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "en_US";
+SELECT * FROM collate_test1 WHERE b COLLATE "C" >= 'bbc' COLLATE "en-x-icu";
 
 
-CREATE DOMAIN testdomain_sv AS text COLLATE "sv_SE";
-CREATE DOMAIN testdomain_i AS int COLLATE "sv_SE"; -- fails
+CREATE DOMAIN testdomain_sv AS text COLLATE "sv-x-icu";
+CREATE DOMAIN testdomain_i AS int COLLATE "sv-x-icu"; -- fails
 CREATE TABLE collate_test4 (
     a int,
     b testdomain_sv
@@ -71,7 +72,7 @@ CREATE TABLE collate_test4 (
 
 CREATE TABLE collate_test5 (
     a int,
-    b testdomain_sv COLLATE "en_US"
+    b testdomain_sv COLLATE "en-x-icu"
 );
 INSERT INTO collate_test5 SELECT * FROM collate_test1;
 SELECT a, b FROM collate_test5 ORDER BY b;
@@ -89,15 +90,15 @@ CREATE TABLE collate_test5 (
 SELECT * FROM collate_test3 ORDER BY b;
 
 -- constant expression folding
-SELECT 'bbc' COLLATE "en_US" > 'äbc' COLLATE "en_US" AS "true";
-SELECT 'bbc' COLLATE "sv_SE" > 'äbc' COLLATE "sv_SE" AS "false";
+SELECT 'bbc' COLLATE "en-x-icu" > 'äbc' COLLATE "en-x-icu" AS "true";
+SELECT 'bbc' COLLATE "sv-x-icu" > 'äbc' COLLATE "sv-x-icu" AS "false";
 
 -- upper/lower
 
 CREATE TABLE collate_test10 (
     a int,
-    x text COLLATE "en_US",
-    y text COLLATE "tr_TR"
+    x text COLLATE "en-x-icu",
+    y text COLLATE "tr-x-icu"
 );
 
 INSERT INTO collate_test10 VALUES (1, 'hij', 'hij'), (2, 'HIJ', 'HIJ');
@@ -116,11 +117,11 @@ CREATE TABLE collate_test10 (
 SELECT * FROM collate_test1 WHERE b ILIKE 'abc%';
 SELECT * FROM collate_test1 WHERE b ILIKE '%bc%';
 
-SELECT 'Türkiye' COLLATE "en_US" ILIKE '%KI%' AS "true";
-SELECT 'Türkiye' COLLATE "tr_TR" ILIKE '%KI%' AS "false";
+SELECT 'Türkiye' COLLATE "en-x-icu" ILIKE '%KI%' AS "true";
+SELECT 'Türkiye' COLLATE "tr-x-icu" ILIKE '%KI%' AS "false";
 
-SELECT 'bıt' ILIKE 'BIT' COLLATE "en_US" AS "false";
-SELECT 'bıt' ILIKE 'BIT' COLLATE "tr_TR" AS "true";
+SELECT 'bıt' ILIKE 'BIT' COLLATE "en-x-icu" AS "false";
+SELECT 'bıt' ILIKE 'BIT' COLLATE "tr-x-icu" AS "true";
 
 -- The following actually exercises the selectivity estimation for ILIKE.
 SELECT relname FROM pg_class WHERE relname ILIKE 'abc%';
@@ -134,21 +135,42 @@ CREATE TABLE collate_test10 (
 SELECT * FROM collate_test1 WHERE b ~* '^abc';
 SELECT * FROM collate_test1 WHERE b ~* 'bc';
 
-SELECT 'Türkiye' COLLATE "en_US" ~* 'KI' AS "true";
-SELECT 'Türkiye' COLLATE "tr_TR" ~* 'KI' AS "false";
-
-SELECT 'bıt' ~* 'BIT' COLLATE "en_US" AS "false";
-SELECT 'bıt' ~* 'BIT' COLLATE "tr_TR" AS "true";
+CREATE TABLE collate_test6 (
+    a int,
+    b text COLLATE "en-x-icu"
+);
+INSERT INTO collate_test6 VALUES (1, 'abc'), (2, 'ABC'), (3, '123'), (4, 'ab1'),
+                                 (5, 'a1!'), (6, 'a c'), (7, '!.;'), (8, '   '),
+                                 (9, 'äbç'), (10, 'ÄBÇ');
+SELECT b,
+       b ~ '^[[:alpha:]]+$' AS is_alpha,
+       b ~ '^[[:upper:]]+$' AS is_upper,
+       b ~ '^[[:lower:]]+$' AS is_lower,
+       b ~ '^[[:digit:]]+$' AS is_digit,
+       b ~ '^[[:alnum:]]+$' AS is_alnum,
+       b ~ '^[[:graph:]]+$' AS is_graph,
+       b ~ '^[[:print:]]+$' AS is_print,
+       b ~ '^[[:punct:]]+$' AS is_punct,
+       b ~ '^[[:space:]]+$' AS is_space
+FROM collate_test6;
+
+SELECT 'Türkiye' COLLATE "en-x-icu" ~* 'KI' AS "true";
+SELECT 'Türkiye' COLLATE "tr-x-icu" ~* 'KI' AS "true";  -- true with ICU
+
+SELECT 'bıt' ~* 'BIT' COLLATE "en-x-icu" AS "false";
+SELECT 'bıt' ~* 'BIT' COLLATE "tr-x-icu" AS "false";  -- false with ICU
 
 -- The following actually exercises the selectivity estimation for ~*.
 SELECT relname FROM pg_class WHERE relname ~* '^abc';
 
 
+/* not run by default because it requires tr_TR system locale
 -- to_char
 
 SET lc_time TO 'tr_TR';
 SELECT to_char(date '2010-04-01', 'DD TMMON YYYY');
-SELECT to_char(date '2010-04-01', 'DD TMMON YYYY' COLLATE "tr_TR");
+SELECT to_char(date '2010-04-01', 'DD TMMON YYYY' COLLATE "tr-x-icu");
+*/
 
 
 -- backwards parsing
@@ -218,9 +240,9 @@ CREATE TABLE test_u AS SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM
 
 -- collation mismatch between recursive and non-recursive term
 WITH RECURSIVE foo(x) AS
-   (SELECT x FROM (VALUES('a' COLLATE "en_US"),('b')) t(x)
+   (SELECT x FROM (VALUES('a' COLLATE "en-x-icu"),('b')) t(x)
    UNION ALL
-   SELECT (x || 'c') COLLATE "de_DE" FROM foo WHERE length(x) < 10)
+   SELECT (x || 'c') COLLATE "de-x-icu" FROM foo WHERE length(x) < 10)
 SELECT * FROM foo;
 
 
@@ -268,7 +290,7 @@ CREATE FUNCTION mylt2 (x text, y text) RETURNS boolean LANGUAGE plpgsql AS $$
 end
 $$;
 
-SELECT mylt2('a', 'B' collate "en_US") as t, mylt2('a', 'B' collate "C") as f;
+SELECT mylt2('a', 'B' collate "en-x-icu") as t, mylt2('a', 'B' collate "C") as f;
 
 CREATE OR REPLACE FUNCTION
   mylt2 (x text, y text) RETURNS boolean LANGUAGE plpgsql AS $$
@@ -320,23 +342,21 @@ CREATE SCHEMA test_schema;
 -- We need to do this this way to cope with varying names for encodings:
 do $$
 BEGIN
-  EXECUTE 'CREATE COLLATION test0 (locale = ' ||
+  EXECUTE 'CREATE COLLATION test0 (provider = icu, locale = ' ||
           quote_literal(current_setting('lc_collate')) || ');';
 END
 $$;
 CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
-CREATE COLLATION IF NOT EXISTS test0 FROM "C"; -- ok, skipped
-CREATE COLLATION IF NOT EXISTS test0 (locale = 'foo'); -- ok, skipped
 do $$
 BEGIN
-  EXECUTE 'CREATE COLLATION test1 (lc_collate = ' ||
+  EXECUTE 'CREATE COLLATION test1 (provider = icu, lc_collate = ' ||
           quote_literal(current_setting('lc_collate')) ||
           ', lc_ctype = ' ||
           quote_literal(current_setting('lc_ctype')) || ');';
 END
 $$;
-CREATE COLLATION test3 (lc_collate = 'en_US.utf8'); -- fail, need lc_ctype
-CREATE COLLATION testx (locale = 'nonsense'); -- fail
+CREATE COLLATION test3 (provider = icu, lc_collate = 'en_US.utf8'); -- fail, need lc_ctype
+CREATE COLLATION testx (provider = icu, locale = 'nonsense'); /* never fails with ICU */  DROP COLLATION testx;
 
 CREATE COLLATION test4 FROM nonsense;
 CREATE COLLATION test5 FROM test0;
@@ -368,6 +388,11 @@ CREATE COLLATION test5 FROM test0;
 DROP ROLE regress_test_role;
 
 
+-- ALTER
+
+ALTER COLLATION "en-x-icu" REFRESH VERSION;
+
+
 -- dependencies
 
 CREATE COLLATION test0 FROM "C";
@@ -391,10 +416,18 @@ CREATE INDEX collate_dep_test4i ON collate_dep_test4t (b COLLATE test0);
 -- test range types and collations
 
 create type textrange_c as range(subtype=text, collation="C");
-create type textrange_en_us as range(subtype=text, collation="en_US");
+create type textrange_en_us as range(subtype=text, collation="en-x-icu");
 
 select textrange_c('A','Z') @> 'b'::text;
 select textrange_en_us('A','Z') @> 'b'::text;
 
 drop type textrange_c;
 drop type textrange_en_us;
+
+
+-- cleanup
+DROP SCHEMA collate_tests CASCADE;
+RESET search_path;
+
+-- leave a collation for pg_upgrade test
+CREATE COLLATION coll_icu_upgrade FROM "und-x-icu";
diff --git a/src/test/regress/sql/collate.linux.utf8.sql b/src/test/regress/sql/collate.linux.utf8.sql
index c349cbde2b..b51162e3a1 100644
--- a/src/test/regress/sql/collate.linux.utf8.sql
+++ b/src/test/regress/sql/collate.linux.utf8.sql
@@ -6,6 +6,9 @@
 
 SET client_encoding TO UTF8;
 
+CREATE SCHEMA collate_tests;
+SET search_path = collate_tests;
+
 
 CREATE TABLE collate_test1 (
     a int,
@@ -134,6 +137,25 @@ CREATE TABLE collate_test10 (
 SELECT * FROM collate_test1 WHERE b ~* '^abc';
 SELECT * FROM collate_test1 WHERE b ~* 'bc';
 
+CREATE TABLE collate_test6 (
+    a int,
+    b text COLLATE "en_US"
+);
+INSERT INTO collate_test6 VALUES (1, 'abc'), (2, 'ABC'), (3, '123'), (4, 'ab1'),
+                                 (5, 'a1!'), (6, 'a c'), (7, '!.;'), (8, '   '),
+                                 (9, 'äbç'), (10, 'ÄBÇ');
+SELECT b,
+       b ~ '^[[:alpha:]]+$' AS is_alpha,
+       b ~ '^[[:upper:]]+$' AS is_upper,
+       b ~ '^[[:lower:]]+$' AS is_lower,
+       b ~ '^[[:digit:]]+$' AS is_digit,
+       b ~ '^[[:alnum:]]+$' AS is_alnum,
+       b ~ '^[[:graph:]]+$' AS is_graph,
+       b ~ '^[[:print:]]+$' AS is_print,
+       b ~ '^[[:punct:]]+$' AS is_punct,
+       b ~ '^[[:space:]]+$' AS is_space
+FROM collate_test6;
+
 SELECT 'Türkiye' COLLATE "en_US" ~* 'KI' AS "true";
 SELECT 'Türkiye' COLLATE "tr_TR" ~* 'KI' AS "false";
 
@@ -337,6 +359,7 @@ CREATE COLLATION IF NOT EXISTS test0 (locale = 'foo'); -- ok, skipped
 $$;
 CREATE COLLATION test3 (lc_collate = 'en_US.utf8'); -- fail, need lc_ctype
 CREATE COLLATION testx (locale = 'nonsense'); -- fail
+CREATE COLLATION testy (locale = 'en_US.utf8', version = 'foo'); -- fail, no versions for libc
 
 CREATE COLLATION test4 FROM nonsense;
 CREATE COLLATION test5 FROM test0;
@@ -368,6 +391,11 @@ CREATE COLLATION test5 FROM test0;
 DROP ROLE regress_test_role;
 
 
+-- ALTER
+
+ALTER COLLATION "en_US" REFRESH VERSION;
+
+
 -- dependencies
 
 CREATE COLLATION test0 FROM "C";
@@ -398,3 +426,7 @@ CREATE INDEX collate_dep_test4i ON collate_dep_test4t (b COLLATE test0);
 
 drop type textrange_c;
 drop type textrange_en_us;
+
+
+-- cleanup
+DROP SCHEMA collate_tests CASCADE;
-- 
2.12.0

#77Andreas Karlsson
andreas@proxel.se
In reply to: Peter Eisentraut (#76)
Re: ICU integration

I am fine with this version of the patch. The issues I have with it,
which I mentioned earlier in this thread, seem to be issues with ICU
itself rather than with this patch. For example, there seems to be no
way to have ICU validate the syntax of BCP 47 locale names (or of ICU's
old locale format). But I think we will just have to accept the
weirdness of how ICU handles locales.
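
For what it is worth, here is a minimal standalone C sketch of that
behavior, using only the public ICU C API (nothing PostgreSQL-specific):
ucol_open() quietly falls back to a default collator for an unrecognized
name and reports at most a warning code, so there is nothing obvious to
reject on.

#include <stdio.h>
#include <unicode/ucol.h>

int
main(void)
{
    UErrorCode  status = U_ZERO_ERROR;
    UCollator  *coll;

    /* "nonsense" is not a valid locale name, yet this still succeeds */
    coll = ucol_open("nonsense", &status);

    if (U_SUCCESS(status))              /* warnings still count as success */
        printf("opened collator anyway, status = %s\n",
               u_errorName(status));    /* e.g. U_USING_DEFAULT_WARNING */

    ucol_close(coll);
    return 0;
}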

I think this patch is ready to be committed.

Found a typo in the documentation:

"The inspect the currently available locales" should be "To inspect the
currently available locales".

Andreas


#78Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Andreas Karlsson (#77)
Re: ICU integration

On 3/23/17 05:34, Andreas Karlsson wrote:

I am fine with this version of the patch. The issues I have with it,
which I mentioned earlier in this thread, seem to be issues with ICU
itself rather than with this patch. For example, there seems to be no
way to have ICU validate the syntax of BCP 47 locale names (or of ICU's
old locale format). But I think we will just have to accept the
weirdness of how ICU handles locales.

I think this patch is ready to be committed.

Found a typo in the documentation:

"The inspect the currently available locales" should be "To inspect the
currently available locales".

Committed.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#79Jeff Janes
jeff.janes@gmail.com
In reply to: Peter Eisentraut (#78)
Re: ICU integration

On Thu, Mar 23, 2017 at 12:34 PM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:

Committed.

This has broken the C locale, and the build farm.

if (pg_database_encoding_max_length() > 1 || locale->provider == COLLPROVIDER_ICU)

segfaults because locale is null (locale_is_c is true).
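
To illustrate the failure mode, here is one possible shape of the guard,
as a hedged sketch only (not the committed fix): lc_ctype_is_c(),
pg_newlocale_from_collation() and pg_database_encoding_max_length() are
the existing helpers, while the function wrapping them is hypothetical.

/* Hypothetical sketch, not the committed fix. */
#include "postgres.h"
#include "catalog/pg_collation.h"
#include "mb/pg_wchar.h"
#include "utils/pg_locale.h"

static bool
want_locale_aware_case_folding(Oid collid)
{
    pg_locale_t mylocale = NULL;

    if (collid != DEFAULT_COLLATION_OID && !lc_ctype_is_c(collid))
        mylocale = pg_newlocale_from_collation(collid);

    /* mylocale stays NULL for C/POSIX, so test it before dereferencing */
    return pg_database_encoding_max_length() > 1 ||
           (mylocale && mylocale->provider == COLLPROVIDER_ICU);
}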

Cheers,

Jeff

#80Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Peter Eisentraut (#78)
Re: ICU integration

On Fri, Mar 24, 2017 at 8:34 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

Committed.

On a macOS system using MacPorts, I have "icu" installed. MacPorts is
a package manager that installs stuff under /opt/local. I have
/opt/local/bin in $PATH so that pkg-config can be found. I run
./configure with --with-icu but without mentioning any special paths
and it says:

checking whether to build with ICU support... yes
checking for pkg-config... /opt/local/bin/pkg-config
checking pkg-config is at least version 0.9.0... yes
checking for icu-uc icu-i18n... yes

... but then building fails, because there are no headers in the search path:

gcc -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing
-fwrapv -Wno-unused-command-line-argument -g -O2 -Wall -Werror
-I../../../src/include/snowball
-I../../../src/include/snowball/libstemmer -I../../../src/include -c
-o dict_snowball.o dict_snowball.c -MMD -MP -MF .deps/dict_snowball.Po
...
../../../src/include/utils/pg_locale.h:19:10: fatal error:
'unicode/ucol.h' file not found

But pkg-config does know where to find those headers:

$ pkg-config --cflags icu-i18n
-I/opt/local/include

... and it's not wrong:

$ ls /opt/local/include/unicode/ucol.h
/opt/local/include/unicode/ucol.h

So I think there may be a bug in the configure script: it should be
putting pkg-config's --cflags output into one of the CFLAGS-like
variables.

Or am I misunderstanding what pkg-config is supposed to be doing for us here?

--
Thomas Munro
http://www.enterprisedb.com


#81Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Thomas Munro (#80)
Re: ICU integration

On 3/23/17 16:45, Thomas Munro wrote:

On Fri, Mar 24, 2017 at 8:34 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

Committed.

On a macOS system using MacPorts, I have "icu" installed. MacPorts is
a package manager that installs stuff under /opt/local. I have
/opt/local/bin in $PATH so that pkg-config can be found. I run
./configure with --with-icu but without mentioning any special paths
and it says:

checking whether to build with ICU support... yes
checking for pkg-config... /opt/local/bin/pkg-config
checking pkg-config is at least version 0.9.0... yes
checking for icu-uc icu-i18n... yes

... but then building fails, because there are no headers in the search path:

Fixed.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#82Craig Ringer
craig@2ndquadrant.com
In reply to: Peter Eisentraut (#81)
Re: ICU integration

On 24 March 2017 at 04:54, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

Fixed.

Congratulations on getting this done. It's great work, and it'll make
a whole class of potential bugs and platform portability warts go away
if widely adopted.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#83Peter Geoghegan
In reply to: Craig Ringer (#82)
Re: ICU integration

On Thu, Mar 23, 2017 at 6:31 PM, Craig Ringer <craig@2ndquadrant.com> wrote:

Congratulations on getting this done. It's great work, and it'll make
a whole class of potential bugs and platform portability warts go away
if widely adopted.

+1

I would like to see us rigorously define a standard for when Postgres
installations are binary compatible. There should be a simple tool.

--
Peter Geoghegan
