full text index and most frequently used words

Started by Bryan Murphyabout 18 years ago2 messagesgeneral

bryan.murphy@gmail.com

about 18 years ago

I'm a bit of a novice writing tsearch2 queries, so forgive me if this
is a basic question.

We have a table with 2million+ records which has a considerable amount
of text content. Some search terms (such as comedy, new, news, music,
etc.) cause a significant performance hit on our web site. There are
simply too many records in the table, and the ranking function takes
too long to rank them all.

We've partially solved this problem by manually identifying
non-performant search queries and pre-caching the results (think
materialized view). However, this process is starting to be become a
burden, and we can't properly anticipate what our community is going
to be searching for in the future.

What I'd like to know is if there is an easy to way to use the full
text index to generate a list of the most common words. I could write
this code manually, but I'm hoping there's a better (simpler) way.

Thanks,
Bryan

Teodor Sigaev

teodor@sigaev.ru

about 18 years ago

In reply to: Bryan Murphy (#1)

Re: full text index and most frequently used words

What I'd like to know is if there is an easy to way to use the full
text index to generate a list of the most common words. I could write
this code manually, but I'm hoping there's a better (simpler) way.

For 8.3
http://www.postgresql.org/docs/8.3/static/textsearch-features.html#TEXTSEARCH-STATISTICS

For versions before 8.3 just use stat() function instead of ts_stat().
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/