tsearch2 & dictionaries - possible problem

Started by Ivan Voras about 16 years ago · 2 messages · general
#1Ivan Voras
ivoras@freebsd.org

Hello,

I think I have a problem with the tsearch2 configuration I'm trying to use.
I created the text search configuration as follows:

--
CREATE TEXT SEARCH DICTIONARY hr_ispell (
    TEMPLATE = ispell,
    DictFile = 'hr',
    AffFile = 'hr',
    StopWords = 'hr'
);

CREATE TEXT SEARCH CONFIGURATION public.ts2hr (COPY=pg_catalog.english);

ALTER TEXT SEARCH CONFIGURATION ts2hr
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart, word,
                      hword, hword_part
    WITH hr_ispell;

SET default_text_search_config = 'public.ts2hr';
--
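A dictionary defined this way can be probed on its own with ts_lexize() before it is wired into a configuration. The following is only a sketch: it assumes the hr_ispell dictionary above was created successfully, and the sample words are taken from the queries in this message.

```sql
-- ts_lexize() tests a single token against a single dictionary.
-- It returns an array of lexemes if the dictionary knows the word,
-- an empty array if the word is a stop word, and NULL if the word
-- is unknown to the dictionary.
SELECT ts_lexize('hr_ispell', 'kiši')  AS known_word,
       ts_lexize('hr_ispell', 'voras') AS probably_unknown_word;
```

A NULL result means the dictionary did not recognize the token at all, which is different from recognizing it as a stop word.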

and here are some queries:

--
cms=> select to_tsvector('voras vorasom');
to_tsvector
-------------

(1 row)

cms=> SET default_text_search_config = 'simple';
SET
cms=> select to_tsvector('voras vorasom');
to_tsvector
-----------------------
'voras':1 'vorasom':2
(1 row)

cms=> SET default_text_search_config = 'ts2hr';
SET
cms=> select to_tsvector('voras vorasom');
to_tsvector
-------------

(1 row)

cms=> select to_tsvector('kiša kiši');
to_tsvector
-------------
'kiša':1,2
(1 row)
--

The good news is that the text search configuration is actually being used
(the 'kiša kiši' example), but apparently for an uncommon word
to_tsvector() returns nothing (the 'voras vorasom' example).

Is there something wrong in the configuration? I would definitely not
want unknown words to be ignored.
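One way to see what happens to each token is ts_debug(), which reports the dictionaries consulted for a token and the one that finally recognized it. This is a hedged sketch against the public.ts2hr configuration from this message; the input string simply combines the words used above.

```sql
-- For each token: its type (alias), the dictionaries mapped to that
-- token type, the dictionary that recognized it, and the lexemes produced.
SELECT alias, token, dictionaries, dictionary, lexemes
FROM ts_debug('public.ts2hr', 'voras vorasom kiši');
```

A row where both dictionary and lexemes are NULL means that none of the mapped dictionaries recognized the token, so it is left out of the tsvector. An empty lexemes array ({}) would instead mean the token was recognized as a stop word.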

#2Oleg Bartunov
oleg@sai.msu.su
In reply to: Ivan Voras (#1)
Re: tsearch2 & dictionaries - possible problem

Ivan,

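Have you found your misunderstanding? You forgot how dictionaries work:
a token that no dictionary in the list recognizes is simply dropped.
You need to add a dictionary that recognizes anything, such as simple or
a stemmer dictionary, at the end of the list so that 'unknown' words are
still handled. Look into the documentation.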
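As a minimal sketch of that advice, assuming the built-in simple dictionary is an acceptable fallback for words that hr_ispell does not know, the mapping from the original message could be changed like this:

```sql
-- Dictionaries are consulted left to right; the first one that
-- recognizes the token wins. simple accepts any token (it only
-- lowercases it), so it belongs at the end of the list.
ALTER TEXT SEARCH CONFIGURATION public.ts2hr
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart, word,
                      hword, hword_part
    WITH hr_ispell, simple;

-- Re-run the failing query against the changed configuration.
SELECT to_tsvector('public.ts2hr', 'voras vorasom');
```

With this mapping, words known to hr_ispell are still normalized by it, and anything else falls through to simple instead of being discarded.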

Oleg
On Wed, 2 Jun 2010, Ivan Voras wrote:

[...]

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83