multi-word expression full-text searching

Started by Fco. Mario Barcala, almost 18 years ago, 5 messages, general
#1 Fco. Mario Barcala
lbarcala@freeresearch.org

Hello all:

I'm testing all full-text searching possibilities of PostgreSQL and...

Is it possible to search for a multi-word expression?

I can search one or more words:

SELECT id FROM document WHERE to_tsvector('english',text) @@
to_tsquery('english','despite');

SELECT id FROM document WHERE to_tsvector('english',text) @@
to_tsquery('english','despite & subject');

But it does not seem to be possible to run a query like:

SELECT id FROM document WHERE to_tsvector('english',text) @@
to_tsquery('english','despite this');

to find the documents in which the expression "despite this" occurs.
This last query gives the following error:

ERROR: syntax error in tsquery: "despite this"

Is it really impossible to search for a multi-word expression?

Thanks in advance,

Mario Barcala

#2 Teodor Sigaev
teodor@sigaev.ru
In reply to: Fco. Mario Barcala (#1)
Re: multi-word expression full-text searching

SELECT id FROM document WHERE to_tsvector('english',text) @@
plainto_tsquery('english','despite this');
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#3 Fco. Mario Barcala
lbarcala@freeresearch.org
In reply to: Teodor Sigaev (#2)
Re: multi-word expression full-text searching

> SELECT id FROM document WHERE to_tsvector('english',text) @@
> plainto_tsquery('english','despite this');
> --
> Teodor Sigaev

If I understand the plainto_tsquery behaviour correctly, this query matches both of these:

Despite this, the president went out.
Despite the event, this question arise.

i.e., if "this" is not in the stop-word list, the query is translated to:
SELECT id FROM document WHERE to_tsvector('english',text) @@
to_tsquery('english','despite & this');

It searches for documents which include "despite" and "this", and not for
ones which have the expression "despite this".

I have made some tests and they confirm my explanations.
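
A test of the behaviour described above might look like the following sketch. It uses the 'simple' configuration, which is an assumption made here so that no stemming or stop-word removal interferes; with the stock 'english' configuration, "this" is normally dropped as a stop word. The two sample sentences are the ones from this message.

```sql
-- 'simple' keeps every word, so both terms survive and are ANDed together.
SELECT plainto_tsquery('simple', 'despite this');
-- returns: 'despite' & 'this'

-- Both sentences match, although only the first contains the phrase itself.
SELECT s AS sentence,
       to_tsvector('simple', s) @@ plainto_tsquery('simple', 'despite this') AS matches
FROM (VALUES
        ('Despite this, the president went out.'),
        ('Despite the event, this question arise.')
     ) AS sample(s);
```

The second query reports a match for both rows, which is the AND behaviour rather than a phrase match.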

Thank you anyway.

Any other solution?

Mario Barcala

#4 Oleg Bartunov
oleg@sai.msu.su
In reply to: Fco. Mario Barcala (#3)
Re: multi-word expression full-text searching

On Tue, 1 Jul 2008, lbarcala@freeresearch.org wrote:

>> SELECT id FROM document WHERE to_tsvector('english',text) @@
>> plainto_tsquery('english','despite this');
>> --
>> Teodor Sigaev
>
> If I understand the plainto_tsquery behaviour correctly, this query matches both of these:
>
> Despite this, the president went out.
> Despite the event, this question arise.

You want 'phrase search', which isn't supported yet. There are
several workarounds; search the archives for 'phrase search'.
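
One workaround of this kind, not necessarily the one discussed in the archives, is to let the full-text match narrow down the candidate rows and then recheck the literal phrase with a pattern match. A minimal sketch, assuming the document table and text column from the original question:

```sql
-- Step 1: the tsquery condition selects candidate rows (and can use a
--         full-text index if one exists on this expression).
-- Step 2: ILIKE rechecks that the words really appear next to each other.
SELECT id
FROM document
WHERE to_tsvector('english', text) @@ plainto_tsquery('english', 'despite this')
  AND text ILIKE '%despite this%';
```

The recheck is a plain substring test, so it would also accept "despite thistle" and would miss the phrase when the two words are separated by a line break or extra spaces; a regular expression could be substituted if that matters.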

> i.e., if "this" is not in the stop-word list, the query is translated to:
> SELECT id FROM document WHERE to_tsvector('english',text) @@
> to_tsquery('english','despite & this');
>
> It searches for documents which include "despite" and "this", and not for
> ones which have the expression "despite this".
>
> I have made some tests and they confirm my explanations.
>
> Thank you anyway.
>
> Any other solution?
>
> Mario Barcala

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

#5 Teodor Sigaev
teodor@sigaev.ru
In reply to: Fco. Mario Barcala (#3)
Re: multi-word expression full-text searching

> If I understand the plainto_tsquery behaviour correctly, this query matches both of these:
> Despite this, the president went out.
> Despite the event, this question arise.

Right, you mean phrase search. Read the thread:
http://archives.postgresql.org/pgsql-hackers/2008-05/msg01111.php

The suggested patch should be made into a module, I think.
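
For readers on later releases: PostgreSQL 9.6 and newer ship phrase search in core, through phraseto_tsquery and the <-> ("followed by") tsquery operator. The sketch below assumes such a server, plus the document table and text column from the original question. It uses the 'simple' configuration so that "this" is not removed as a stop word.

```sql
-- Requires PostgreSQL 9.6 or later.
SELECT phraseto_tsquery('simple', 'despite this');
-- returns: 'despite' <-> 'this'

-- Matches only rows where "despite" is immediately followed by "this".
SELECT id
FROM document
WHERE to_tsvector('simple', text) @@ phraseto_tsquery('simple', 'despite this');
```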

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/