LC_COLLATE and index usage

Started by Forest Wilkinsonalmost 23 years ago9 messagesgeneral
Jump to latest
#1Forest Wilkinson
lyris-pg@tibit.com

According to the postgres 7.3.2 docs, initializing a database cluster
with LC_COLLATE set to anything other than "C" will disable the use of
indexes during LIKE searches. This presents a bit of a problem for
me, since the client application I'm writing uses LIKE searches and
ships internationally. As I understand it, I can have good
performance, or locale-specific ORDER BY output, but not both. Is
there a plan to remove this limitation?

#2Forest Wilkinson
lyris-pg@tibit.com
In reply to: Forest Wilkinson (#1)
How to determine a database cluster's LC_COLLATE setting?

How can I determine what LC_COLLATE setting was used initialize a
postgres installation? At the moment, I'm trying to figure out what
collation locale was chosen by the RPM install of postgres 7.3.2, but
it would also be useful to know how to check this with any random
installation.

#3Karsten Hilbert
Karsten.Hilbert@gmx.net
In reply to: Forest Wilkinson (#1)
Re: LC_COLLATE and index usage

According to the postgres 7.3.2 docs, initializing a database cluster
with LC_COLLATE set to anything other than "C" will disable the use of
indexes during LIKE searches. This presents a bit of a problem for
me, since the client application I'm writing uses LIKE searches and
ships internationally. As I understand it, I can have good
performance, or locale-specific ORDER BY output, but not both. Is
there a plan to remove this limitation?

Question seconded (and tacit suggestion for a TODO item) on
behalf of GnuMed (www.gnumed.org).

Karsten Hilbert, MD
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346

#4Dennis Gearon
gearond@cvc.net
In reply to: Forest Wilkinson (#1)
Re: LC_COLLATE and index usage

Personally,
I'd like to see LOCALE done away with, everything be in UTF-8, and have
a POSGRES extension that sets the sort order as an itemized listing of
mime type languages like:

SET COLLATION( en-us, fr, ru ).

Everything NOT in those character sets, falls below them in the sort
order, but stays in it's tabular order, otherwise.

PS. We are sure lucky in the western world that we have such a
simplistic alphabet.

Karsten Hilbert wrote:

Show quoted text

According to the postgres 7.3.2 docs, initializing a database cluster
with LC_COLLATE set to anything other than "C" will disable the use of
indexes during LIKE searches. This presents a bit of a problem for
me, since the client application I'm writing uses LIKE searches and
ships internationally. As I understand it, I can have good
performance, or locale-specific ORDER BY output, but not both. Is
there a plan to remove this limitation?

Question seconded (and tacit suggestion for a TODO item) on
behalf of GnuMed (www.gnumed.org).

Karsten Hilbert, MD
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

#5Karsten Hilbert
Karsten.Hilbert@gmx.net
In reply to: Dennis Gearon (#4)
Re: LC_COLLATE and index usage

I'd like to see LOCALE done away with, everything be in UTF-8, and have
a POSGRES extension that sets the sort order as an itemized listing of
mime type languages like:

SET COLLATION( en-us, fr, ru ).

Everything NOT in those character sets, falls below them in the sort
order, but stays in it's tabular order, otherwise.

Sure, I don't care either way as long as I get indexes on my
localized sort order (I do know I don't have a right to
request that unless I offer a patch ;-)

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346

#6Ralph Graulich
maillist@shauny.de
In reply to: Karsten Hilbert (#3)
Re: LC_COLLATE and index usage

Hi Karsten,

Question seconded (and tacit suggestion for a TODO item) on
behalf of GnuMed (www.gnumed.org).

Depending on how your LIKE searches look like, you can use a little work
around, which improves speed dramatically. In case you need to search like
"<String>%", you can use a second table containing each possible
combination of the search string, a fixed length field and JOIN that
helper table.

So you can change your LIKE search to a search for a string of fixed
length.

Kind regards
... Ralph ...

#7Alvaro Herrera
alvherre@dcc.uchile.cl
In reply to: Karsten Hilbert (#5)
Re: LC_COLLATE and index usage

On Thu, May 08, 2003 at 02:24:48AM +0200, Karsten Hilbert wrote:

(I do know I don't have a right to
request that unless I offer a patch ;-)

IMHO, we as users have the right, and probably the duty as well, to
request (reasonable?) things, because that keeps the project alive.

Feeding the To-Do list is one of the tasks of users; reporting bugs,
testing beta releases and appraising developers being some others.

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Oh, great altar of passive entertainment, bestow upon me thy discordant images
at such speed as to render linear thought impossible" (Calvin a la TV)

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Forest Wilkinson (#2)
Re: How to determine a database cluster's LC_COLLATE setting?

"Forest Wilkinson" <lyris-pg@tibit.com> writes:

How can I determine what LC_COLLATE setting was used initialize a
postgres installation?

The pg_controldata utility program can tell you.

regards, tom lane

#9Andrew Sullivan
andrew@libertyrms.info
In reply to: Forest Wilkinson (#1)
Re: LC_COLLATE and index usage

On Wed, May 07, 2003 at 03:34:24PM -0700, Forest Wilkinson wrote:

ships internationally. As I understand it, I can have good
performance, or locale-specific ORDER BY output, but not both. Is
there a plan to remove this limitation?

I think there would be, if anyone could come up with a comprehensive
set of rules about how to handle these cases. See the -hackers
archives for plenty of discussion of this.

A

-- 
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110