BUG #1972: index error with space character
The following bug has been logged online:
Bug reference: 1972
Logged by: Eduardo Soares
Email address: edurbs@gmail.com
PostgreSQL version: 8.0.3
Operating system: Linux Fedora 4
Description: index error with space character
Details:
In the example below, "AZTES Z" should come after "AZTESA". This happens with
any encoding. The database does not seem to see the space character. The index
should take the space into account and put "AZTES Z" next to "AZTES". In the
list below, "AZTESA" should come first.
Thanks for the help.
table=# insert into edu values ('AZTES Z');
INSERT 133634 1
table=# insert into edu values ('AZTESA');
INSERT 133635 1
table=# SELECT * FROM EDU ORDER BY NOME DESC;
nome
---------------
AZTES Z
AZTESA
AZTES
ÃNTES
ANTES
(8 registros)
Eduardo Soares wrote:
Operating system: Linux Fedora 4
Description: index error with space character
Details:
In the above example, "AZTES Z" should come after "AZTESA". This happens with
any encoding. The database does not seem to see the space character. The index
should take the space into account and put "AZTES Z" next to "AZTES". In the
above list, "AZTESA" should come first.
Sorting order is determined by your locale, and is different from your
encoding. For example, en_GB ignores spaces but C doesn't:
$ LC_COLLATE=en_GB.UTF-8 sort unsorted.txt
aa a
aaaa
aaab
aa b
$ LC_COLLATE=C sort unsorted.txt
aa a
aa b
aaaa
aaab
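To make the difference concrete, here is a minimal Python sketch that reproduces the two orderings above without depending on which locales are installed. It is only a rough emulation: the helper names c_order and space_ignoring_order are made up for illustration, the word list is an assumed copy of unsorted.txt, and real locale collation in glibc is considerably more involved than dropping spaces before comparing.

```python
# Rough emulation only; this does not call into the system locale at all.

# Assumed contents of unsorted.txt, taken from the listings above.
words = ["aaab", "aa b", "aaaa", "aa a"]


def c_order(items):
    """Sort by raw code point, which is what the C locale does for ASCII."""
    return sorted(items)


def space_ignoring_order(items):
    """Compare with spaces removed first; the raw string only breaks ties."""
    return sorted(items, key=lambda s: (s.replace(" ", ""), s))


print(c_order(words))               # ['aa a', 'aa b', 'aaaa', 'aaab']
print(space_ignoring_order(words))  # ['aa a', 'aaaa', 'aaab', 'aa b']
```

The space (0x20) sorts before every letter in code point order, which is why C keeps "aa a" and "aa b" together, while the space-ignoring key effectively compares "aab" against "aaaa" and "aaab".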
See "man locale" for details on how to find out what locales are set up
on your machine. See the documentation for details on how to set locale
on a database cluster.
HTH
--
Richard Huxton
Archonet Ltd
---------- Forwarded message ----------
From: Eduardo RBS <edurbs@gmail.com>
Date: 18/10/2005 13:31
Subject: Re: [BUGS] BUG #1972: index error with space character
To: Richard Huxton <dev@archonet.com>
Hello,
Thank you very much for the attention.
I need a locale that does not ignore the space character and also sorts
accented letters like á or ã.
I tried some configurations using locale; see below.
Using C is almost good, because it sorts correctly and sees the spaces,
but it does not sort the Portuguese accents. Note the last two
lines:
$ LC_COLLATE=C sort b.txt
aa a
aa z
aaaa
aaaz
aaz
eado
eza
édina
émaster
And with my own locale, pt_BR, it sorts the accents correctly but
ignores the space character:
$ LC_COLLATE=pt_BR.utf8 sort b.txt
aa a
aaaa
aaab
aaaz
aaz
aa z
eado
édina
émaster
eza
What I need is a merge of C and pt_BR, that is, a locale that sees
the spaces like C but sorts accents like pt_BR.
I tried several other locales, and only C sees the space character.
Thanks for the attention.
--
[]'s
Eduardo RBS
http://linuxstok.sourceforge.net
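The ordering asked for here (spaces significant as in C, accented letters next to their base letters as in pt_BR) can be approximated outside the database with a custom sort key. The following Python sketch is an illustration under stated assumptions, not a locale definition: strip_accents and accent_aware_key are hypothetical helper names, the list is copied from the C listing above, and it only handles accents that Unicode can decompose into a base letter plus combining marks.

```python
import unicodedata


def strip_accents(text):
    """Drop combining marks after canonical decomposition (é -> e)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def accent_aware_key(text):
    """Primary key: accent-free text, spaces kept. Tie-breaker: raw text."""
    return (strip_accents(text), text)


# Assumed contents of b.txt, taken from the C listing above.
names = ["aa a", "aa z", "aaaa", "aaaz", "aaz",
         "eado", "eza", "édina", "émaster"]

for name in sorted(names, key=accent_aware_key):
    print(name)
# Order printed: aa a, aa z, aaaa, aaaz, aaz, eado, édina, émaster, eza
```

Because the space is kept in the primary key, "aa a" and "aa z" still sort before "aaaa", while "édina" and "émaster" land between "eado" and "eza". This only helps when sorting is done in the application; it does not change how the database orders rows or builds indexes.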
Eduardo RBS <edurbs@gmail.com> writes:
I need a locale that does not ignore the space character and also sorts
accented letters like á or ã.
I'm afraid you'll have to learn how to build your own locale definition.
AFAIK, the "C" locale is the *only* common locale in which spaces aren't
second-class citizens.
I know that it is possible to write your own locale definition, but
this is not the place to ask about how. Try a glibc support forum.
regards, tom lane