Tsearch2 / PG 8.2 Which stemmer files?

Started by Hannes Dorbath over 19 years ago, 4 messages, general
#1 Hannes Dorbath
light@theendofthetunnel.de

Which stemmer files is one supposed to use with 8.2 Tsearch2?

Trying to compile the output from Gendict with:

stem_UTF_8_german.c
stem_UTF_8_german.h

from:

http://snowball.tartarus.org/dist/libstemmer_c.tgz

gives:

http://hannes.imos.net/make.txt

Thanks!

--
Regards,
Hannes Dorbath

#2 Hannes Dorbath
light@theendofthetunnel.de
In reply to: Hannes Dorbath (#1)
Re: Tsearch2 / PG 8.2 Which stemmer files?

On 07.12.2006 12:42, Hannes Dorbath wrote:

> Which stemmer files is one supposed to use with 8.2 Tsearch2?

I found an answer myself. It seems I need:

http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/tsearch_snowball_82.gz

--
Regards,
Hannes Dorbath

#3 Oleg Bartunov
oleg@sai.msu.su
In reply to: Hannes Dorbath (#1)
Re: Tsearch2 / PG 8.2 Which stemmer files?

Hannes,

please download the patch tsearch_snowball_82.gz from
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/
which updates the API to Snowball.

Oleg
On Thu, 7 Dec 2006, Hannes Dorbath wrote:

> Which stemmer files is one supposed to use with 8.2 Tsearch2?
>
> Trying to compile the output from Gendict with:
>
> stem_UTF_8_german.c
> stem_UTF_8_german.h
>
> from:
>
> http://snowball.tartarus.org/dist/libstemmer_c.tgz
>
> gives:
>
> http://hannes.imos.net/make.txt
>
> Thanks!

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

#4 Hannes Dorbath
light@theendofthetunnel.de
In reply to: Oleg Bartunov (#3)
Re: Tsearch2 / PG 8.2 Which stemmer files?

Thank you Oleg.

I have a bit more trouble migrating from 8.1.5 with TSearch2 + GIN/UTF-8
to PG 8.2.

First I tried to use my existing dict and affix files, which triggered the
oldFormat condition. So I tried to start from scratch. What I can't get
working again is compound word support for German.

What I did:

1. Download the OpenOffice dictionary from http://j3e.de/hunspell/de_DE.zip
2. Extract de_DE.dic
3. Run compound.pl on de_DE.dic
4. Put the modified de_DE.dic back in the zip and run my2ispell on the files
5. Convert both resulting files to UTF-8
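
For reference, step 5 can be sketched in a few lines of Python. This is only a minimal sketch, not necessarily how the conversion was done here: it assumes the files produced in step 4 are encoded in ISO-8859-1, and the function name convert_to_utf8 is made up for illustration.

```python
from pathlib import Path


def convert_to_utf8(src: Path, dst: Path, src_encoding: str = "iso-8859-1") -> None:
    """Re-encode a text file from src_encoding to UTF-8.

    Assumption: the source encoding is ISO-8859-1; pass another
    encoding name if the dictionary files use something else.
    """
    # Work on raw bytes so line endings are preserved exactly.
    text = src.read_bytes().decode(src_encoding)
    dst.write_bytes(text.encode("utf-8"))
```

It would be called once for the dictionary file and once for the affix file, for example convert_to_utf8(Path("german.dict"), Path("german.utf8.dict")), where both file names are hypothetical. Decoding with the wrong source encoding does not raise an error for ISO-8859-1, so it is worth checking a few words with umlauts in the output.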

Do I need to hack compound.pl to do something different, now that the
affix format has changed?

I'd really appreciate any hint.

Thanks!

On 07.12.2006 14:52, Oleg Bartunov wrote:

> please download patch tsearch_snowball_82.gz
> http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/
> which updates API to snowball.

--
Regards,
Hannes Dorbath