Search

Started by Dmitry Iv Popovover 26 years ago4 messagesgeneral
Jump to latest
#1Dmitry Iv Popov
mitlas@auricvision.com

Greetings,

I store text data in windows-1251 encoding in my pgSQL database
and when I execute SQL command like

SELECT title, firstname, lastname, description FROM stuff
WHERE title LIKE '%<keyword>%' AND firtsname LIKE '%<keyword>%'
AND lastname LIKE '%<keyword>%' AND description LIKE '%<keyword>%'

the response comes really quick, but if I want case independent
search, I should convert, using my own PHP3 function, both search
field and keyword to low case just becouse PostgreSQL by itself
doesn't know how to convert characters in windows-1251 encoding
to low case. And response time grows from 0:07 to 1:10!

Is there any solution? May be stored procedures (that doesn't
require to recompile pgSQL)? Thank you in advance!!!

P.S.: as far as I know there is no locale for windows-1251 and,
unfortunately, I can't store data in koi8-r whose locale present
in almost all UNIX'es.

Mentioned above function:
function lower($str)
{
$length = strlen($str);

for ($i = 0; $i < $length; ++$i)
{
$sym = ord($str[$i]);
if (($sym > 34 && $sym < 60) || ($sym > 191 && $sym < 224))
$str[$i] = chr($sym + 32);
}

return $str;
}

--
BEST WISHES, Dmitry
Webmaster, DeltaCube Studio
http://www.deltacube.com/

#2Gilles Darold
darold@neptune.fr
In reply to: Dmitry Iv Popov (#1)
Re: [GENERAL] Search

Hi,

For case insensitive you must use the regexp search as follow :

SELECT title, firstname, lastname, description FROM stuff
WHERE title *~ '.*<keyword>.*' AND firtsname *~ '.*<keyword>.*';

P.S.: as far as I know there is no locale for windows-1251 and,
unfortunately, I can't store data in koi8-r whose locale present
in almost all UNIX'es.

I don't know about windows-1251. Perhaps this can't help you. But if
you have japanese in you database, you can proceed a search on it so
why not windows-1251 ?

A+

Gilles

#3Herouth Maoz
herouth@oumail.openu.ac.il
In reply to: Gilles Darold (#2)
Re: [GENERAL] Search

At 03:43 +0300 on 01/08/1999, Gilles Darold wrote:

I don't know about windows-1251. Perhaps this can't help you. But if
you have japanese in you database, you can proceed a search on it so
why not windows-1251 ?

Because the backend has to know that the lowercase for char NNN in this
codepage is MMM. This is different from one locale to the next. If it were
windows-1255 (Hebrew), there would be no lowercase at all for any character
above 224. So of course it's different than Japanese.

There can be two solutions to this problem:

1) Write a function using SPI, install it on the backend, and use
it for the comparison.

2) Create the new locale, or at least the LC_CTYPE part of the locale,
on the unix you are using.

Herouth

--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herutma

#4Oleg Broytmann
phd@emerald.netskate.ru
In reply to: Dmitry Iv Popov (#1)
Re: [GENERAL] Search

On Sat, 31 Jul 1999, Dmitry Popov wrote:

I store text data in windows-1251 encoding in my pgSQL database
and when I execute SQL command like

SELECT title, firstname, lastname, description FROM stuff
WHERE title LIKE '%<keyword>%' AND firtsname LIKE '%<keyword>%'
AND lastname LIKE '%<keyword>%' AND description LIKE '%<keyword>%'

the response comes really quick, but if I want case independent
search, I should convert, using my own PHP3 function, both search
field and keyword to low case just becouse PostgreSQL by itself
doesn't know how to convert characters in windows-1251 encoding
to low case. And response time grows from 0:07 to 1:10!

Is there any solution? May be stored procedures (that doesn't
require to recompile pgSQL)? Thank you in advance!!!

P.S.: as far as I know there is no locale for windows-1251 and,
unfortunately, I can't store data in koi8-r whose locale present
in almost all UNIX'es.

Incorrect solution: install win1251 locale. Once I saw a reference to
such beast in Russian Apache mailing list. Search through mail archive,
look into Apache-Rus FAQs...

Correct solution: use koi8-r locale and koi8<=>win1251 on-the-fly
conversion. Postgres starting from 6.5 can do on the fly conversion between
many Cyrillic locales. Look into doc/README.locale.

--
BEST WISHES, Dmitry
Webmaster, DeltaCube Studio
http://www.deltacube.com/

Oleg.
----
Oleg Broytmann http://members.xoom.com/phd2/ phd2@earthling.net
Programmers don't die, they just GOSUB without RETURN.