improved support for compounds in tsearch2

Started by Oleg Bartunov over 21 years ago | 3 messages | hackers
#1 Oleg Bartunov
oleg@sai.msu.su

Hi there,

We have just submitted several changes to tsearch2 to CVS:

1. Changed WordEntryPos from a struct {} to a typedef of uint16; for details see
http://www.pgsql.ru/db/mw/msg.html?mid=2035188
2. Improved support for compound words

"A compound is a word containing a stem that is made up of more than one root."
to_tsquery() now makes use of the roots if the dictionary returns them for a
compound word (the dictionary should support the 'compoundwords' flag; check
the .aff file). Example:

regression=# select to_tsquery( 'fotballklubber');
to_tsquery
------------------------------------------------
'fotball' & 'klubb' | 'fot' & 'ball' & 'klubb'
(1 row)
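
As a rough illustration of the output above (this is not the actual tsearch2 C code): when the dictionary returns several alternative splits of a compound, the roots within one split are ANDed together and the alternative splits are ORed. Below is a minimal Python sketch of that combination step; `build_query` and the `variants` list are hypothetical names chosen for this example, and the two splits are simply copied from the output shown above.

```python
def build_query(variants):
    """Join the roots of each variant with '&' and the variants with '|'."""
    parts = []
    for roots in variants:
        # Quote each root the way to_tsquery prints lexemes.
        parts.append(" & ".join("'{}'".format(r) for r in roots))
    return " | ".join(parts)


# Assumed dictionary output for 'fotballklubber': two alternative splits.
variants = [["fotball", "klubb"], ["fot", "ball", "klubb"]]
print(build_query(variants))
# prints: 'fotball' & 'klubb' | 'fot' & 'ball' & 'klubb'
```

Since '&' binds tighter than '|' in a tsquery, the result matches documents that contain all the roots of at least one split.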

The bad news is that the API for tsearch2 dictionaries has changed!
See http://www.pgsql.ru/db/mw/msg.html?mid=2039406
for details, and http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_compound_words
for an introduction to compound support in tsearch2.

We badly need testers for compound support (German, Norwegian, ... languages);
a patch for the V8.0 release is available at
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/expand_query_8.0.patch.gz

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

#2 Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Oleg Bartunov (#1)
Re: improved support for compounds in tsearch2

> We badly need testers for compound support (German, Norwegian, ... languages);
> a patch for the V8.0 release is available at
> http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/expand_query_8.0.patch.gz

What I'm interested in is compound word support for English. For
example, if a food has the word 'beefburger' in it, how do I index that
as both 'beef' and 'burger'?

Chris

#3 Oleg Bartunov
oleg@sai.msu.su
In reply to: Christopher Kings-Lynne (#2)
Re: improved support for compounds in tsearch2

On Tue, 25 Jan 2005, Christopher Kings-Lynne wrote:

> > We badly need testers for compound support (German, Norwegian, ... languages);
> > a patch for the V8.0 release is available at
> > http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/expand_query_8.0.patch.gz
>
> What I'm interested in is compound word support for English. For example, if
> a food has the word 'beefburger' in it, how do I index that as both 'beef'
> and 'burger'?

You can do that once you have an ispell English dictionary with
'compoundwords' support, or you can just write a custom dictionary.
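
To make the custom dictionary idea more concrete, here is a minimal Python sketch of the splitting step only. It is not the tsearch2 dictionary API (which is written in C); `ROOTS` and `split_compound` are hypothetical names, and the root list is made up for illustration.

```python
# Hypothetical list of known English roots; a real dictionary would load these
# from a word list file.
ROOTS = {"beef", "burger", "cheese"}


def split_compound(word, roots=ROOTS):
    """Return every way to write `word` as a concatenation of known roots."""
    if not word:
        return [[]]  # one way to split the empty string: no roots at all
    splits = []
    for i in range(1, len(word) + 1):
        head = word[:i]
        if head in roots:
            # Keep this head and try to split whatever is left.
            for rest in split_compound(word[i:], roots):
                splits.append([head] + rest)
    return splits


print(split_compound("beefburger"))  # [['beef', 'burger']]
print(split_compound("hamburger"))   # [] because 'ham' is not a known root
```

Each returned split could then be indexed as separate lexemes, so that a search for either 'beef' or 'burger' finds 'beefburger'.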

> Chris

Regards,
Oleg