"simple" dict with stop words in tsearch2

Started by Pierre Thibaudeau about 19 years ago, 3 messages, general
#1 Pierre Thibaudeau
pierdeux@gmail.com

In tsearch2, I would like to use the "simple" dictionary along with my
own list of stopwords.

In other words, once the text is parsed into tokens, no stemming
whatsoever, but stopwords are removed.

Is there an easy way to produce that result, using the standard
"simple" dictionary?

#2 Oleg Bartunov
oleg@sai.msu.su
In reply to: Pierre Thibaudeau (#1)
Re: "simple" dict with stop words in tsearch2

On Mon, 29 Jan 2007, Pierre Thibaudeau wrote:

In tsearch2, I would like to use the "simple" dictionary along with my
own list of stopwords.

In other words, once the text is parsed into tokens, no stemming
whatsoever, but stopwords are removed.

Is there an easy way to produce that result, using the standard
"simple" dictionary?

Sure, just set dict_initoption for the "simple" dictionary to your stop-word file. For example:
test=# update pg_ts_dict set dict_initoption='contrib/english.stop' where dict_name='simple';
UPDATE 1
test=# select lexize('simple', 'the');
lexize
--------
{}
(1 row)
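
To make the behaviour above concrete, here is a minimal Python sketch of what a stemming-free dictionary with a stop list does to a single token. It is a conceptual model only, not tsearch2's implementation: the function names `load_stopwords` and `simple_lexize` are hypothetical, and lowercasing the token before the lookup is an assumption of the sketch.

```python
def load_stopwords(lines):
    """Build a stop list from an iterable of lines, one word per line.

    Blank lines are skipped; words are lowercased (an assumption of this sketch).
    """
    return {line.strip().lower() for line in lines if line.strip()}


def simple_lexize(token, stopwords):
    """Model a dictionary that removes stop words and does no stemming.

    Returns an empty list for a stop word (compare the empty {} result
    above), otherwise a one-element list holding the lowercased token.
    """
    word = token.lower()
    return [] if word in stopwords else [word]


# Hypothetical stop list standing in for the contents of a stop-word file.
stopwords = load_stopwords(["the", "a", "of", ""])

print(simple_lexize("the", stopwords))      # stop word: dropped
print(simple_lexize("running", stopwords))  # kept as-is, no stemming
```

The point of the sketch is that "running" comes back untouched, whereas a stemming dictionary would typically reduce it to "run".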

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

#3 Pierre Thibaudeau
pierdeux@gmail.com
In reply to: Oleg Bartunov (#2)
Re: "simple" dict with stop words in tsearch2

Brilliant! Thank you!
