prefix search in tsearch

Started by Erik Rijkers over 15 years ago, 4 messages, docs
#1 Erik Rijkers
er@xs4all.nl

[docs from cvs HEAD]

I found the text-search documentation a little unclear about 'prefix search'; specifically, the
examples do not show that the so-called 'prefix' is first stemmed before it is used as a prefix.

For instance, the following can be a little surprising:

SELECT to_tsvector( 'postgraduate' ) @@ to_tsquery( 'postgres:*' );
 ?column?
----------
 t
(1 row)
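
To see why this matches, it can help to print both sides of the comparison. The sketch below assumes the stock english configuration, named explicitly here rather than relying on default_text_search_config:

```sql
-- Show the lexemes on each side of the @@ comparison.
-- Assumes the stock 'english' configuration.
SELECT to_tsvector('english', 'postgraduate') AS document,
       to_tsquery('english', 'postgres:*')    AS query;
--    document    |   query
-- ---------------+------------
--  'postgradu':1 | 'postgr':*
```

Since 'postgradu' begins with 'postgr', the prefix match succeeds even though 'postgraduate' does not begin with 'postgres'.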

Because prefix search is such an important feature, I think this should be explained better,
which I hope the attached doc patch does.

(textsearch.sgml contains another mention and example of prefix search; perhaps it should be
extended a little too. I'm happy to do that as well, but I first wanted to see whether you agree
that the text is a little too obscure as it stands.)

Erik Rijkers

Attachments:

datatype.sgml.diff (text/x-patch; name=datatype.sgml.diff), +19 -1

#2 Oleg Bartunov
oleg@sai.msu.su
In reply to: Erik Rijkers (#1)
Re: prefix search in tsearch

Erik,

I think it would be clearer to say not 'stemmed' but 'processed according to the
text search configuration'. Here is an example:

$SHAREDIR/tsearch_data/my_synonyms.syn contains one line:
one 1

CREATE TEXT SEARCH DICTIONARY my_synonym (
    TEMPLATE = synonym,
    SYNONYMS = my_synonyms
);

ALTER TEXT SEARCH CONFIGURATION english
    ALTER MAPPING FOR asciiword
    WITH my_synonym, english_stem;

test=# select 'one'::tsvector @@ to_tsquery('english','one:*');
 ?column?
----------
 f
(1 row)

This returns false because 'one' was processed by the my_synonym dictionary, which replaces it with '1':

test=# select ts_debug('english','one');
ts_debug
------------------------------------------------------------------------------
(asciiword,"Word, all ASCII",one,"{my_synonym,english_stem}",my_synonym,{1})
(1 row)
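
A note for anyone trying this out: the example alters the built-in english configuration, which affects every query that uses it. A more cautious sketch, assuming the my_synonym dictionary above has been created but the ALTER on english has not been run, works on a copy instead. The name my_english is hypothetical, chosen only for this illustration:

```sql
-- Work on a copy so the built-in 'english' configuration stays untouched.
-- 'my_english' is a hypothetical name chosen for this sketch.
CREATE TEXT SEARCH CONFIGURATION my_english ( COPY = english );

ALTER TEXT SEARCH CONFIGURATION my_english
    ALTER MAPPING FOR asciiword
    WITH my_synonym, english_stem;

-- Inspect what the prefix query is rewritten to before comparing it
-- against a tsvector.
SELECT to_tsquery('my_english', 'one:*');

-- Remove the experimental configuration when done.
DROP TEXT SEARCH CONFIGURATION my_english;
```

Printing the tsquery on its own should show the prefix already rewritten according to the synonym file, which is the reason it no longer matches the literal lexeme 'one'.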

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

#3 Bruce Momjian
bruce@momjian.us
In reply to: Oleg Bartunov (#2)
Re: prefix search in tsearch

I applied a modified documentation patch (attached) that includes Oleg's
suggestions.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

Attachments:

/rtmp/ts.diff (text/x-diff), +23 -23

#4 Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#3)
Re: prefix search in tsearch

I came up with some better wording, which I have applied:

This query will match any word in a <type>tsvector</> that begins
with <quote>super</>. Note that prefixes are first processed by
text search configurations, which means this comparison returns
true:
