tsearch: how to get a list of stopwords?

Started by Joerg Erdmenger | almost 23 years ago | 4 messages | general
#1 Joerg Erdmenger
joe@woerd.com

Hi there,

me again. How do I find the stopwords that tsearch uses in its standard
configuration? I've looked at contrib/tsearch/dict/porter_english.dct and get
a feeling it's somewhere in there but I can't decipher it. Any suggestions?

Joerg

#2 Oleg Bartunov
oleg@sai.msu.su
In reply to: Joerg Erdmenger (#1)
Re: tsearch: how to get a list of stopwords?

On Thu, 28 Aug 2003, Joerg Erdmenger wrote:

> Hi there,
>
> me again. How do I find the stopwords that tsearch uses in its standard
> configuration? I've looked at contrib/tsearch/dict/porter_english.dct and get
> a feeling it's somewhere in there but I can't decipher it. Any suggestions?

You're right. They're encoded in engstoptree :)
I suggest you not bother with the old tsearch and look at the tsearch2 version,
which is much improved in both performance and flexibility.
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/

Oleg

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

#3 Joerg Erdmenger
joe@woerd.com
In reply to: Oleg Bartunov (#2)
Re: tsearch: how to get a list of stopwords?

hi

> You're right. They're encoded in engstoptree :)
> I suggest you not bother with the old tsearch and look at the tsearch2 version,
> which is much improved in both performance and flexibility.
> http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/

Well, I would like to, but I've got to get it to work on a production server. I
will try to get the admins to install it, but I guess that will take some time.
Meanwhile, is there any way to get at the list of stopwords so that I can
build a filter for those as a temporary workaround?

thanks

Joerg

#4 Oleg Bartunov
oleg@sai.msu.su
In reply to: Joerg Erdmenger (#3)
Re: tsearch: how to get a list of stopwords?

On Thu, 28 Aug 2003, Joerg Erdmenger wrote:

> Well, I would like to, but I've got to get it to work on a production server. I
> will try to get the admins to install it, but I guess that will take some time.
> Meanwhile, is there any way to get at the list of stopwords so that I can
> build a filter for those as a temporary workaround?

tsearch2 can live alongside tsearch, so you can play with it.
I attached the english.stop file from the OpenFTS distribution. But I'm not 100% sure
it's the same as in portereng.c :)
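
For the temporary filter, something along these lines might do. It is only a
rough sketch in Python: it assumes english.stop holds one stopword per line,
and the function names (parse_stopwords, load_stopwords, filter_stopwords) are
made up for illustration, they are not part of tsearch or OpenFTS. It also
does no stemming, so it will not match tsearch's behaviour exactly.

```python
import re


def parse_stopwords(lines):
    """Build a set of lowercase stopwords from an iterable of lines.

    Assumption: one stopword per line; blank lines are skipped.
    """
    return {line.strip().lower() for line in lines if line.strip()}


def load_stopwords(path):
    """Read a stopword file such as english.stop (path is an assumption)."""
    with open(path, encoding="utf-8") as f:
        return parse_stopwords(f)


def filter_stopwords(text, stopwords):
    """Lowercase the text, split it into words, and drop the stopwords."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in stopwords]
```

Usage would be roughly filter_stopwords(query, load_stopwords("english.stop")),
applied to the search terms before they are handed to tsearch.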

Regards,
Oleg

Attachments:

english.stop (text/plain; charset=US-ASCII)