BUG #16020: ICU Collations querys

Started by PG Bug reporting formover 6 years ago5 messagesbugs
Jump to latest
#1PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 16020
Logged by: Marina Garrido Sanchez
Email address: garridosanchezmarina@gmail.com
PostgreSQL version: 12beta4
Operating system: Windows
Description:

When I use a particular ICU Collation query with locale =
'es-ES-u-ks-level1' and deterministic = false for having case- and accet-
insensitive, I can use similiar, like and ilike as the documentation said,
but how can I do a parcial search without this operators?

#2Peter Eisentraut
peter_e@gmx.net
In reply to: PG Bug reporting form (#1)
Re: BUG #16020: ICU Collations querys

On 2019-09-25 08:51, PG Bug reporting form wrote:

The following bug has been logged on the website:

Bug reference: 16020
Logged by: Marina Garrido Sanchez
Email address: garridosanchezmarina@gmail.com
PostgreSQL version: 12beta4
Operating system: Windows
Description:

When I use a particular ICU Collation query with locale =
'es-ES-u-ks-level1' and deterministic = false for having case- and accet-
insensitive, I can use similiar, like and ilike as the documentation said,
but how can I do a parcial search without this operators?

This doesn't sound like a bug, so it's inappropriate for this forum.

Also, please provide some more detail, including exact queries and what
you are hoping for as a result.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#3Marina Garrido Sanchez
garridosanchezmarina@gmail.com
In reply to: PG Bug reporting form (#1)
Re: BUG #16020: ICU Collations querys

Hi,

Sorry I dont see other way to send my issue or question, and I assumed that
as It is something new (ICU Collations) for case and accet insensitive it
would be better to do in this forum.

The questions is...that if I do the following script:

create collation ca_insensitive (provider = icu, locale =
'es-ES-u-ks-level1', deterministic = false;

create table users (
nombre text collate "ca_insensitive" primary key unique,
apellidos text collate "ca_insensitive",
direccion text
);

insert into user values ("jávier", "gonzález", "BBB")
insert into user values ("Javier", "Gonzalez", "BBB")

and if I do the query:

select * from users where users.apellidos ilike '%Gonz%';

I get the error that *ilike *does not support nondeterministic operation,
for that, my question is How can I do partial match search with case- and
accet- insensitive in postgres? or Which operator can I use to do the
search?

Thanks in advance.

El jue., 26 sept. 2019 a las 17:11, Marina Garrido Sanchez (<
garridosanchezmarina@gmail.com>) escribió:

Show quoted text

Hi,

Sorry I dont see other way to send my issue or question, and I assumed
that as It is something new (ICU Collations) for case and accet insensitive
it would be better to do in this forum.

The questions is...that if I do the following script:

create collation ca_insensitive (provider = icu, locale =
'es-ES-u-ks-level1', deterministic = false;

create table users (
nombre text collate "ca_insensitive" primary key unique,
apellidos text collate "ca_insensitive",
direccion text
);

insert into user values ("jávier", "gonzález", "BBB")
insert into user values ("Javier", "Gonzalez", "BBB")

and if I do the query:

select * from users where users.apellidos ilike '%Gonz%';

I get the error that *ilike *does not support nondeterministic operation,
for that, my question is How can I do partial match search with case- and
accet- insensitive in postgres? or Which operator can I use to do the
search?

Thanks in advance.

El mié., 25 sept. 2019 a las 22:09, Peter Eisentraut (<
peter.eisentraut@2ndquadrant.com>) escribió:

On 2019-09-25 08:51, PG Bug reporting form wrote:

The following bug has been logged on the website:

Bug reference: 16020
Logged by: Marina Garrido Sanchez
Email address: garridosanchezmarina@gmail.com
PostgreSQL version: 12beta4
Operating system: Windows
Description:

When I use a particular ICU Collation query with locale =
'es-ES-u-ks-level1' and deterministic = false for having case- and

accet-

insensitive, I can use similiar, like and ilike as the documentation

said,

but how can I do a parcial search without this operators?

This doesn't sound like a bug, so it's inappropriate for this forum.

Also, please provide some more detail, including exact queries and what
you are hoping for as a result.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#4Peter Eisentraut
peter_e@gmx.net
In reply to: Marina Garrido Sanchez (#3)
Re: BUG #16020: ICU Collations querys

On 2019-09-26 17:14, Marina Garrido Sanchez wrote:

create collation ca_insensitive (provider = icu, locale =
'es-ES-u-ks-level1', deterministic = false;

create table users (
nombre text collate "ca_insensitive" primary key unique,
apellidos text collate "ca_insensitive",
direccion text
);

insert into user values ("jávier", "gonzález", "BBB")
insert into user values ("Javier", "Gonzalez", "BBB")

and if I do the query:

select * from users where users.apellidos ilike '%Gonz%';

I get the error that *ilike *does not support nondeterministic
operation, for that, my question is How can I do partial match search
with case- and accet- insensitive in postgres? or Which operator can I
use to do the search?

You can run your query with an explicit deterministic collation applied,
for example:

select * from users where users.apellidos ilike '%Gonz%' collate "C";

or "und-x-icu" or whatever.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#5Daniel Verite
daniel@manitou-mail.org
In reply to: Marina Garrido Sanchez (#3)
Re: BUG #16020: ICU Collations querys

Marina Garrido Sanchez wrote:

create collation ca_insensitive (provider = icu, locale =
'es-ES-u-ks-level1', deterministic = false;

create table users (
nombre text collate "ca_insensitive" primary key unique,
apellidos text collate "ca_insensitive",
direccion text
);

insert into user values ("jávier", "gonzález", "BBB")
insert into user values ("Javier", "Gonzalez", "BBB")

Independently of the syntax problem (literals must be enclosed
in single quotes, not double quotes), the second insert should
fail since 'jávier' = 'Javier' with the above-defined collation and
there is a unique index on users.nombre.

select * from users where users.apellidos ilike '%Gonz%';

I get the error that *ilike *does not support nondeterministic operation,
for that, my question is How can I do partial match search with case- and
accet- insensitive in postgres? or Which operator can I use to do the
search?

If you're interested only in the case insensitiveness, you
could force a deterministic collation to the ilike argument,
for instance:

select * from users where users.apellidos ilike '%Gonz%' collate "es-x-icu";

If you really need collate-sensitive substring search with non-deterministic
collations, I don't think there is any way in Postgres 12 to get that.
If you can install the icu_ext extension [1]https://github.com/dverite/icu_ext#icu_strpos, its icu_strpos() function
implements that using the collation-aware string search feature in ICU.

As string like '%foo%' is equivalent to strpos(string, 'foo')>0
when string is associated to a deterministic collation,
string like '%foo%' with a non-derministic collation
is not supported but it is equivalent to icu_strpos(string, 'foo')>0.

For instance, with the "ca_insensitive" collation you defined,
both 'jáv' and ''Jav' match 'jav':

insert into users values ('jávier', 'gonzález', 'BBB');
insert into users values ('Javier2', 'Gonzalez', 'BBB');

select * from users where icu_strpos(nombre, 'jav')>0;
nombre | apellidos | direccion
---------+-----------+-----------
jávier | gonzález | BBB
Javier2 | Gonzalez | BBB

[1]: https://github.com/dverite/icu_ext#icu_strpos

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite