Postgres 12: backend crashes when creating non-deterministic collation

Started by Thomas Kellerer over 6 years ago, 14 messages, general
#1 Thomas Kellerer
spam_eater@gmx.net

I was trying to learn how the new non-deterministic collations in v12 work, but the following makes the backend crash:

CREATE COLLATION de_ci (provider = icu, locale = 'de-x-icu', deterministic = false);

Which leads to:

2019-10-04 11:54:23 CEST LOG: server process (PID 7540) was terminated by exception 0xC0000005
2019-10-04 11:54:23 CEST DETAIL: Failed process was running:
CREATE COLLATION de_ci (provider = icu, locale = 'de-x-icu', deterministic = false)
2019-10-04 11:54:23 CEST HINT: See C include file "ntstatus.h" for a description of the hexadecimal value.
2019-10-04 11:54:23 CEST LOG: terminating any other active server processes
2019-10-04 11:54:23 CEST WARNING: terminating connection because of crash of another server process
2019-10-04 11:54:23 CEST DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
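
For reference, the exception code in the log above is an NTSTATUS value, and 0xC0000005 is STATUS_ACCESS_VIOLATION, i.e. the process touched memory it was not allowed to access. A minimal sketch in Python of pulling that code out of such a log line follows. It is not part of PostgreSQL: the function name, the regular expression and the three-entry lookup table are illustrative assumptions, and the table is only a tiny excerpt of ntstatus.h.

```python
import re

# Small, hand-picked excerpt of NTSTATUS codes from ntstatus.h (not complete).
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
    0xC0000094: "STATUS_INTEGER_DIVIDE_BY_ZERO",
}

# Matches the "terminated by exception 0x..." part of a postmaster log line.
EXCEPTION_RE = re.compile(r"terminated by exception (0x[0-9A-Fa-f]+)")


def describe_exception(log_line):
    """Return (code, name) for a crash log line, or None if no code is found."""
    match = EXCEPTION_RE.search(log_line)
    if match is None:
        return None
    code = int(match.group(1), 16)
    return code, NTSTATUS_NAMES.get(code, "unknown NTSTATUS code")


line = ("2019-10-04 11:54:23 CEST LOG: server process (PID 7540) was "
        "terminated by exception 0xC0000005")
print(describe_exception(line))
```

An access violation only says that the backend dereferenced a bad address; it does not say whether the fault lies in PostgreSQL or in a library it calls, which is why a stack trace is needed.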

This is on Windows 10 with the Postgres 12 binaries from EDB.
Exact Postgres version is: PostgreSQL 12.0, compiled by Visual C++ build 1914, 64-bit
The database was pg_upgraded if that makes any difference

I might have misunderstood how to use deterministic to create a case-insensitive collation, but I don't think the backend should crash if I do something wrong ;)

Regards
Thomas

#2 Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Thomas Kellerer (#1)
Re: Postgres 12: backend crashes when creating non-deterministic collation

Thomas Kellerer wrote:

I was trying to learn how the new non-deterministic collations in v12
work, but the following makes the backend crash:

CREATE COLLATION de_ci (provider = icu, locale = 'de-x-icu',
deterministic = false);

Which leads to:

2019-10-04 11:54:23 CEST LOG: server process (PID 7540) was
terminated by exception 0xC0000005

I might have misunderstood how to use deterministic to create a
case-insensitive collation, but I don't think the backend should crash if
I do something wrong ;)

Yes, there is a bug somewhere. FWIW, it works on my Linux system.

To get a case insensitive collation you'd have to use something like

LOCALE = 'de-DE-u-ks-level2'

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

#3 Thomas Kellerer
spam_eater@gmx.net
In reply to: Laurenz Albe (#2)
Re: Postgres 12: backend crashes when creating non-deterministic collation

Laurenz Albe wrote on 04.10.2019 at 16:04:

I was trying to learn how the new non-deterministic collations in v12
work, but the following makes the backend crash:

CREATE COLLATION de_ci (provider = icu, locale = 'de-x-icu',
deterministic = false);

Which leads to:

2019-10-04 11:54:23 CEST LOG: server process (PID 7540) was
terminated by exception 0xC0000005

I might have misunderstood how to use deterministic to create a
case-insensitive collation, but I don't think the backend should crash if
I do something wrong ;)

Yes, there is a bug somewhere. FWIW, it works on my Linux system.

It also works on Windows when I specify "correct" locale names - the above seems to be an edge case.
Is it worth the effort to report that through the bug reporting form?

To get a case insensitive collation you'd have to use something like

LOCALE = 'de-DE-u-ks-level2'

Creating works, but apparently on Windows ICU does not support this.

The following works fine on Linux (returns both rows), but not on Windows (returns nothing)

create collation de_ci (provider = icu, locale = 'de-DE-u-ks-level2', deterministic = false);
create table test (name text);
insert into test values ('FOO'), ('Foo');

select *
from test
where name = 'foo' collate de_ci;

Not a big deal, but might surprise some people.

Thomas

#4 Daniel Verite
daniel@manitou-mail.org
In reply to: Thomas Kellerer (#3)
Re: Postgres 12: backend crashes when creating non-deterministic collation

Thomas Kellerer wrote:

It also works on Windows when I specify "correct" locale names - the above
seems to be an edge case.
Is it worth the effort to report that through the bug reporting form?

Sure. Both the crash with 'de-x-icu' and the difference in behavior
between Linux and Windows on locale = 'de-DE-u-ks-level2' look like
"must-fix" bugs to me. Both are valid ICU locales and should work the
same in all operating systems.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

#5 Andreas Kretschmer
andreas@a-kretschmer.de
In reply to: Thomas Kellerer (#1)
Re: Postgres 12: backend crashes when creating non-deterministic collation

On 04.10.19 at 12:13, Thomas Kellerer wrote:

I was trying to learn how the new non-deterministic collations in v12
work, but the following makes the backend crash:

CREATE COLLATION de_ci (provider = icu, locale = 'de-x-icu',
deterministic = false);

Which leads to:

2019-10-04 11:54:23 CEST   LOG:  server process (PID 7540) was
terminated by exception 0xC0000005
2019-10-04 11:54:23 CEST   DETAIL:  Failed process was running:
    CREATE COLLATION de_ci (provider = icu, locale = 'de-x-icu',
deterministic = false)
2019-10-04 11:54:23 CEST   HINT:  See C include file "ntstatus.h" for
a description of the hexadecimal value.
2019-10-04 11:54:23 CEST   LOG:  terminating any other active server
processes
2019-10-04 11:54:23 CEST   WARNING:  terminating connection because of
crash of another server process
2019-10-04 11:54:23 CEST   DETAIL:  The postmaster has commanded this
server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.

This is on Windows 10 with the Postgres 12 binaries from EDB.
Exact Postgres version is: PostgreSQL 12.0, compiled by Visual C++
build 1914, 64-bit
The database was pg_upgraded if that makes any difference

works for me, with:

psql (12rc1 (Ubuntu 12~rc1-1.pgdg18.04+1))
Type "help" for help.

test=# CREATE COLLATION de_ci (provider = icu, locale = 'de-x-icu',
deterministic = false);
CREATE COLLATION
test=*# commit;
COMMIT
test=#

Regards, Andreas

--
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com

#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Kellerer (#3)
Re: Postgres 12: backend crashes when creating non-deterministic collation

Thomas Kellerer <spam_eater@gmx.net> writes:

It also works on Windows when I specify "correct" locale names - the above seems to be an edge case.
Is it worth the effort to report that through the bug reporting form?

No, this thread is a sufficient report. What *would* be a good use
of time is to get a stack trace from the crash, if you can.

regards, tom lane

#7 Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#6)
Re: Postgres 12: backend crashes when creating non-deterministic collation

On 2019-10-04 10:52:38 -0400, Tom Lane wrote:

Thomas Kellerer <spam_eater@gmx.net> writes:

It also works on Windows when I specify "correct" locale names - the above seems to be an edge case.
Is it worth the effort to report that through the bug reporting form?

No, this thread is a sufficient report. What *would* be a good use
of time is to get a stack trace from the crash, if you can.

FWIW, https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Windows
might be helpful.

#8 Daniel Verite
daniel@manitou-mail.org
In reply to: Thomas Kellerer (#3)
Re: Postgres 12: backend crashes when creating non-deterministic collation

Thomas Kellerer wrote:

To get a case insensitive collation you'd have to use something like

LOCALE = 'de-DE-u-ks-level2'

Creating works, but apparently on Windows ICU does not support this.

After installing v12 on Windows with the EDB installer, I notice
that it ships with ICU 53, a relatively old version (2014).

Concerning the problem just above (not the crash), ICU 53 is too old
to support BCP47 tags as collation attributes, as mentioned
at https://www.postgresql.org/docs/12/collation.html :

"The first example selects the ICU locale using a “language tag” per
BCP 47. The second example uses the traditional ICU-specific locale
syntax. The first style is preferred going forward, but it is not
supported by older ICU versions."

With ICU 53 or older, instead of the locale above, we must use the
old-style syntax:

locale = 'de-DE@colStrength=secondary'

If you use that in your example, the case insensitive lookups should
work.
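
The rule above can be sketched as a small helper that picks the locale syntax from the ICU major version. This is Python for illustration only and is not anything PostgreSQL ships: the function names are hypothetical, and the cut-off at ICU 53 is taken from this message rather than from the ICU changelog.

```python
# Assumption from the message above: ICU 53 or older needs the old-style
# syntax; newer versions accept BCP 47 language tags with attributes.
LAST_ICU_MAJOR_NEEDING_OLD_SYNTAX = 53


def case_insensitive_locale(icu_major, language="de", region="DE"):
    """Return a locale string for a secondary-strength (case-insensitive) collation."""
    if icu_major <= LAST_ICU_MAJOR_NEEDING_OLD_SYNTAX:
        # Traditional ICU-specific syntax.
        return f"{language}-{region}@colStrength=secondary"
    # BCP 47 language tag; "ks-level2" selects secondary strength.
    return f"{language}-{region}-u-ks-level2"


def create_collation_sql(name, icu_major):
    """Build the CREATE COLLATION statement discussed in this thread.

    The collation name is interpolated as-is, so only pass trusted identifiers.
    """
    locale = case_insensitive_locale(icu_major)
    return (f"CREATE COLLATION {name} (provider = icu, "
            f"locale = '{locale}', deterministic = false);")


print(create_collation_sql("de_ci", 53))
print(create_collation_sql("de_ci", 65))
```

The first call prints the old-style statement that suits the ICU 53 build, the second the BCP 47 form for a newer ICU.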

But it's unfortunate that the EDB build did not switch to a recent ICU
version for PostgreSQL 12.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

#9 Thomas Kellerer
spam_eater@gmx.net
In reply to: Daniel Verite (#8)
Re: Postgres 12: backend crashes when creating non-deterministic collation

Daniel Verite wrote on 04.10.2019 at 18:49:

Creating works, but apparently on Windows ICU does not support this.

After installing v12 on Windows with the EDB installer, I notice
that it ships with ICU 53, a relatively old version (2014).

Concerning the problem just above (not the crash), ICU 53 is too old
to support BCP47 tags as collation attributes, as mentioned
at https://www.postgresql.org/docs/12/collation.html :

"The first example selects the ICU locale using a “language tag” per
BCP 47. The second example uses the traditional ICU-specific locale
syntax. The first style is preferred going forward, but it is not
supported by older ICU versions."

With ICU 53 or older, instead of the locale above, we must use the
old-style syntax:

locale = 'de-DE@colStrength=secondary'

If you use that in your example, the case insensitive lookups should
work.

That indeed works, thanks a lot.

#10 Thomas Kellerer
spam_eater@gmx.net
In reply to: Tom Lane (#6)
Re: Postgres 12: backend crashes when creating non-deterministic collation

Tom Lane wrote on 04.10.2019 at 16:52:

Is it worth the effort to report that through the bug reporting form?

No, this thread is a sufficient report. What *would* be a good use
of time is to get a stack trace from the crash, if you can.

I don't know if I did everything correctly, but here it is. I hope it helps

icuuc53.dll!0000000064964a80() Unknown
icuuc53.dll!0000000064964c2d() Unknown
icuuc53.dll!0000000064966328() Unknown
icuuc53.dll!0000000064965469() Unknown
icuuc53.dll!000000006495ef28() Unknown
icuuc53.dll!0000000064961501() Unknown
icuuc53.dll!000000006495b330() Unknown
icuuc53.dll!0000000064959b9e() Unknown
icuin53.dll!0000000064a8bd92() Unknown
postgres.exe!get_collation_actual_version(char collprovider, const char * collcollate) Line 1533 C
postgres.exe!DefineCollation(ParseState * pstate, List * names, List * parameters, bool if_not_exists) Line 218 C
postgres.exe!ProcessUtilitySlow(ParseState * pstate, PlannedStmt * pstmt, const char * queryString, ProcessUtilityContext context, ParamListInfoData * params, QueryEnvironment * queryEnv, _DestReceiver * dest, char * completionTag) Line 1292 C
postgres.exe!standard_ProcessUtility(PlannedStmt * pstmt, const char * queryString, ProcessUtilityContext context, ParamListInfoData * params, QueryEnvironment * queryEnv, _DestReceiver * dest, char * completionTag) Line 933 C
postgres.exe!ProcessUtility(PlannedStmt * pstmt, const char * queryString, ProcessUtilityContext context, ParamListInfoData * params, QueryEnvironment * queryEnv, _DestReceiver * dest, char * completionTag) Line 363 C
postgres.exe!PortalRunUtility(PortalData * portal, PlannedStmt * pstmt, bool isTopLevel, bool setHoldSnapshot, _DestReceiver * dest, char * completionTag) Line 1184 C
postgres.exe!PortalRunMulti(PortalData * portal, bool isTopLevel, bool setHoldSnapshot, _DestReceiver * dest, _DestReceiver * altdest, char * completionTag) Line 1323 C
postgres.exe!PortalRun(PortalData * portal, long count, bool isTopLevel, bool run_once, _DestReceiver * dest, _DestReceiver * altdest, char * completionTag) Line 800 C
postgres.exe!exec_execute_message(const char * portal_name, long max_rows) Line 2098 C
postgres.exe!PostgresMain(int argc, char * * argv, const char * dbname, const char * username) Line 4299 C
[Inline Frame] postgres.exe!BackendRun(Port *) Line 4431 C
postgres.exe!SubPostmasterMain(int argc, char * * argv) Line 4953 C
postgres.exe!main(int argc, char * * argv) Line 216 C

So it happens somewhere inside the ICU DLL - but I don't have the symbols for that
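
A trace in this "module!symbol(...)" form can also be read mechanically to see which module holds the innermost frame and which PostgreSQL function made the call into it. The following is a minimal Python sketch under that assumption about the format; the helper names and the regular expression are hypothetical, and the sample is a shortened excerpt of the frames above with the trailing columns dropped.

```python
import re

# A frame looks like "module!symbol(...)". An optional bracketed prefix
# (such as an inline-frame marker) in front of the module name is skipped.
FRAME_RE = re.compile(r"^(?:\[[^\]]*\]\s*)?(?P<module>[^!\s]+)!(?P<symbol>[^(\s]+)")


def parse_frames(trace):
    """Return (module, symbol) pairs in the order listed, innermost frame first."""
    frames = []
    for raw in trace.splitlines():
        match = FRAME_RE.match(raw.strip())
        if match:
            frames.append((match.group("module"), match.group("symbol")))
    return frames


def first_frame_in(frames, module):
    """Return the innermost symbol that belongs to the given module, or None."""
    for mod, symbol in frames:
        if mod == module:
            return symbol
    return None


trace = """\
icuuc53.dll!0000000064964a80()
icuin53.dll!0000000064a8bd92()
postgres.exe!get_collation_actual_version(char collprovider, const char * collcollate)
postgres.exe!DefineCollation(ParseState * pstate, List * names, List * parameters, bool if_not_exists)
"""
frames = parse_frames(trace)
print(frames[0][0])                            # module of the crashing frame
print(first_frame_in(frames, "postgres.exe"))  # last PostgreSQL function entered
```

For the excerpt, the crashing frame is in icuuc53.dll and the last PostgreSQL function on the stack is get_collation_actual_version.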

#11 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Kellerer (#10)
Re: Postgres 12: backend crashes when creating non-deterministic collation

Thomas Kellerer <spam_eater@gmx.net> writes:

Tom Lane wrote on 04.10.2019 at 16:52:

No, this thread is a sufficient report. What *would* be a good use
of time is to get a stack trace from the crash, if you can.

I don't know if I did everything correctly, but here it is. I hope it helps

icuuc53.dll!0000000064964a80() Unknown
icuuc53.dll!0000000064964c2d() Unknown
icuuc53.dll!0000000064966328() Unknown
icuuc53.dll!0000000064965469() Unknown
icuuc53.dll!000000006495ef28() Unknown
icuuc53.dll!0000000064961501() Unknown
icuuc53.dll!000000006495b330() Unknown
icuuc53.dll!0000000064959b9e() Unknown
icuin53.dll!0000000064a8bd92() Unknown
postgres.exe!get_collation_actual_version(char collprovider, const char * collcollate) Line 1533 C
postgres.exe!DefineCollation(ParseState * pstate, List * names, List * parameters, bool if_not_exists) Line 218 C
postgres.exe!ProcessUtilitySlow(ParseState * pstate, PlannedStmt * pstmt, const char * queryString, ProcessUtilityContext context, ParamListInfoData * params, QueryEnvironment * queryEnv, _DestReceiver * dest, char * completionTag) Line 1292 C
postgres.exe!standard_ProcessUtility(PlannedStmt * pstmt, const char * queryString, ProcessUtilityContext context, ParamListInfoData * params, QueryEnvironment * queryEnv, _DestReceiver * dest, char * completionTag) Line 933 C

Hm. This trace says that the crash happened somewhere down inside ICU
itself, during the ucol_open() call in get_collation_actual_version().
There isn't much we could have done to mess up the arguments to that
function. That would seem to mean that it's ICU's bug not ours.
Maybe another reason not to be using such an old ICU version :-(.

regards, tom lane

#12 Thomas Kellerer
spam_eater@gmx.net
In reply to: Tom Lane (#11)
Re: Postgres 12: backend crashes when creating non-deterministic collation

Tom Lane wrote on 04.10.2019 at 19:36:

Hm. This trace says that the crash happened somewhere down inside ICU
itself, during the ucol_open() call in get_collation_actual_version().
There isn't much we could have done to mess up the arguments to that
function. That would seem to mean that it's ICU's bug not ours.
Maybe another reason not to be using such an old ICU version :-(.

I would like to test this with a newer ICU version.

So I managed to set up the build environment with Visual Studio, but I can't figure out how to enable ICU for the build.

I created a config.pl to specify the location of the downloaded ICU libraries with the following content

# Configuration arguments for vcbuild.
use strict;
use warnings;

our $config = {
icu => "d:\Projects\postgres\libs\icu\lib64\" # --with-icu=<path>
};

The build is successful, but when I run "install targetfolder" no ICU libraries are included and pg_config only shows:

CONFIGURE = --enable-thread-safety --with-ldap --without-zlib

I have no idea what I am missing.

I also tried building with Msys2 but even when I run configure with the --with-icu option, no ICU DLLs are copied

Regards
Thomas

#13 Thomas Kellerer
spam_eater@gmx.net
In reply to: Thomas Kellerer (#12)
Re: Postgres 12: backend crashes when creating non-deterministic collation

Thomas Kellerer wrote on 05.10.2019 at 13:39:

Hm.  This trace says that the crash happened somewhere down inside ICU
itself, during the ucol_open() call in get_collation_actual_version().
There isn't much we could have done to mess up the arguments to that
function.  That would seem to mean that it's ICU's bug not ours.
Maybe another reason not to be using such an old ICU version :-(.

I would like to test this with a newer ICU version.

So I managed to set up the build environment with Visual Studio, but I can't figure out how to enable ICU for the build.

Ah, figured it out.

config.pl has a different format compared to config_default.pl

$config->{icu}='d:\Projects\postgres\libs\icu'

did the trick, and postgres was built with ICU support.

I can confirm that with ICU 65 the crash does not occur and the case insensitive comparison works fine as well.

Regards
Thomas

#14 Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#11)
Re: Postgres 12: backend crashes when creating non-deterministic collation

On 2019-10-04 19:36, Tom Lane wrote:

Hm. This trace says that the crash happened somewhere down inside ICU
itself, during the ucol_open() call in get_collation_actual_version().
There isn't much we could have done to mess up the arguments to that
function. That would seem to mean that it's ICU's bug not ours.

Some build farm coverage of Windows+ICU would be nice. We have test
cases in place that might have caught this.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services