Full Text Search - Slow on common words

Started by sub3 over 15 years ago | 3 messages | general
#1 sub3
steve@subwest.com

Hi,

I have a small web page set up to search within my domain based on keywords.
One of the queries is:
SELECT page.id, ts_rank_cd('{1.0, 1.0, 1.0, 1.0}', contFTI, q) FROM page,
to_tsquery('steve') AS q WHERE contFTI @@ q

My problem is: when someone enters a common word, the system slows
down because of the large amount of data retrieved from the table and
processed by the ts_rank_cd function.

How does everyone else handle something like this? I can only think of 2
possible solutions:
- change the query to search for the same terms at least twice in the same
document (can I do that?)
- limit any searches to x results before ranking & tell the user their
search criteria are too generic.

Is there a better solution that I am missing?

Thanks,
Steve
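
The second idea above (cap the candidate rows before ranking) could be sketched roughly as follows. This is only an illustration against the page table and contFTI column from the query above; the cap of 1001 rows, the final limit of 20, and the names candidates and score are assumptions chosen for the example, not anything from the original setup.

```sql
-- Rank only a bounded set of matches instead of every matching row.
-- Assumption: 1000 is the most rows we are willing to rank; fetching
-- one extra row (1001) lets the application detect "too generic".
SELECT id,
       ts_rank_cd('{1.0, 1.0, 1.0, 1.0}', contFTI, q) AS score
FROM (
    SELECT page.id, page.contFTI, q
    FROM page, to_tsquery('steve') AS q
    WHERE page.contFTI @@ q
    LIMIT 1001              -- stop collecting matches after the cap
) AS candidates
ORDER BY score DESC
LIMIT 20;                   -- hypothetical page size for the results
```

Note that the inner LIMIT picks an arbitrary subset of the matches, so the ranking is only "best of the first 1001 found", not best overall. If the application separately counts the inner rows and sees 1001, it can tell the user the search is too generic.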

#2 Reid Thompson
Reid.Thompson@ateb.com
In reply to: sub3 (#1)
Re: Full Text Search - Slow on common words

On Thu, 2010-10-28 at 12:08 -0700, sub3 wrote:

> [...]
> Is there a better solution that I am missing?

If the keyword is that common, is it really a keyword? Exclude it.

#3 Dann Corbit
DCorbit@connx.com
In reply to: Reid Thompson (#2)
Re: Full Text Search - Slow on common words

On Thursday, October 28, 2010 12:57 PM, Reid Thompson wrote:

> If the keyword is that common, is it really a keyword? Exclude it.

This general idea is called a stopword list. You create a list of words that are so common that searching on them is counter-productive.

http://en.wikipedia.org/wiki/Stop_words
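
In PostgreSQL, a stopword list is attached to a text search dictionary. A minimal sketch of a custom one might look like the following. The names site_simple, site_search and the stopword file site_stop are hypothetical; the file would have to exist as site_stop.stop in the tsearch_data directory of the installation, with one word per line.

```sql
-- Dictionary based on the built-in "simple" template, with a custom
-- stopword file (assumed: $SHAREDIR/tsearch_data/site_stop.stop).
CREATE TEXT SEARCH DICTIONARY site_simple (
    TEMPLATE  = pg_catalog.simple,
    STOPWORDS = site_stop
);

-- Configuration copied from "simple", routing plain words through
-- the new dictionary so that listed words are dropped.
CREATE TEXT SEARCH CONFIGURATION site_search (COPY = pg_catalog.simple);
ALTER TEXT SEARCH CONFIGURATION site_search
    ALTER MAPPING FOR asciiword, word WITH site_simple;

-- Parse the user's input with that configuration. Words listed in the
-- stopword file are removed from the resulting query.
SELECT to_tsquery('site_search', 'steve & postgres');
```

If every word the user typed is a stopword, the resulting query is empty, which is a natural point to tell the user the search is too generic. The tsvector column would normally be built with the same configuration so that indexing and querying agree.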
