a little fix for text search

Started by Oleg Bartunov over 9 years ago | 4 messages | docs
#1 Oleg Bartunov
oleg@sai.msu.su

Hi there!

I don't know exactly when it was improved, but the following notice in
https://www.postgresql.org/docs/current/static/textsearch-controls.html#TEXTSEARCH-HEADLINE
is no longer needed.

ts_headline uses the original document, not a tsvector summary, so it can
be slow and should be used with care. A typical mistake is to call
ts_headline for every matching document when only ten documents are to be
shown. SQL subqueries can help; here is an example:

Regards,
Oleg

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Oleg Bartunov (#1)
Re: a little fix for text search

Oleg Bartunov <obartunov@gmail.com> writes:

Hi there !
I don't know when exactly it was improved, but following notice in
https://www.postgresql.org/docs/current/static/textsearch-controls.html#TEXTSEARCH-HEADLINE
is currently not needed.

ts_headline uses the original document, not a tsvector summary, so it can
be slow and should be used with care. A typical mistake is to call
ts_headline for every matching document when only ten documents are to be
shown. SQL subqueries can help; here is an example:

I don't see why that stopped being appropriate? The point is that it
takes a raw text input which has to be re-parsed; that's still true
AFAICS.

regards, tom lane

--
Sent via pgsql-docs mailing list (pgsql-docs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-docs

#3 Oleg Bartunov
oleg@sai.msu.su
In reply to: Tom Lane (#2)
Re: a little fix for text search

On Sat, Nov 12, 2016 at 11:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Oleg Bartunov <obartunov@gmail.com> writes:

Hi there !
I don't know when exactly it was improved, but following notice in
https://www.postgresql.org/docs/current/static/textsearch-controls.html#TEXTSEARCH-HEADLINE
is currently not needed.

ts_headline uses the original document, not a tsvector summary, so it can
be slow and should be used with care. A typical mistake is to call
ts_headline for every matching document when only ten documents are to be
shown. SQL subqueries can help; here is an example:

I don't see why that stopped being appropriate? The point is that it
takes a raw text input which has to be re-parsed; that's still true
AFAICS.

I mean that in the past we recommended using a subselect to avoid extra
ts_headline() calls. Now, at least in 9.6, that advice is obsolete: both of
the following SQL queries call ts_headline() exactly 5 times.

select ts_headline(body, to_tsquery('supernovae & x-ray')),
       ts_rank(fts, to_tsquery('supernovae & x-ray')) as rank
from apod
where fts @@ to_tsquery('supernovae & x-ray')
order by rank desc limit 5;

explain (analyze, costs off)
select ts_headline(body, to_tsquery('supernovae & x-ray')), rank
from (
    select body, ts_rank(fts, to_tsquery('supernovae & x-ray')) as rank
    from apod
    where fts @@ to_tsquery('supernovae & x-ray')
    order by rank desc limit 5
) as foo;
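
For anyone who wants to check the call count without reading EXPLAIN
output, one possible approach is to wrap ts_headline() in a function that
bumps a sequence on every call. This is only a sketch: the table apod_demo,
the sequence headline_calls, and the function counted_headline are made-up
names for illustration, and the wrapper is volatile while ts_headline()
itself is not, so the planner may not treat the two identically.

```sql
-- Hypothetical stand-in for the apod table (all names here are assumptions).
CREATE TEMP TABLE apod_demo (id int, body text, fts tsvector);

INSERT INTO apod_demo
SELECT i,
       'supernovae observed in x-ray, frame ' || i,
       to_tsvector('english', 'supernovae observed in x-ray, frame ' || i)
FROM generate_series(1, 1000) AS i;

-- Sequence used purely as a call counter.
CREATE TEMP SEQUENCE headline_calls;

-- Wrapper that bumps the counter, then delegates to ts_headline().
-- Temporary functions must be called with the pg_temp schema prefix.
CREATE FUNCTION pg_temp.counted_headline(doc text, q tsquery)
RETURNS text
LANGUAGE plpgsql AS $$
BEGIN
    PERFORM nextval('headline_calls');
    RETURN ts_headline(doc, q);
END;
$$;

-- Same shape as the first query above, without the subselect.
SELECT pg_temp.counted_headline(body, to_tsquery('english', 'supernovae & x-ray')),
       ts_rank(fts, to_tsquery('english', 'supernovae & x-ray')) AS rank
FROM apod_demo
WHERE fts @@ to_tsquery('english', 'supernovae & x-ray')
ORDER BY rank DESC
LIMIT 5;

-- last_value is the number of wrapper calls once is_called is true.
SELECT last_value, is_called FROM headline_calls;
```

If the claim above holds on the server being tested, the counter should
read 5 after the query; one call per matching row would point to the older
behaviour.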


#4 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Oleg Bartunov (#3)
Re: a little fix for text search

Oleg Bartunov <obartunov@gmail.com> writes:

On Sat, Nov 12, 2016 at 11:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I don't see why that stopped being appropriate? The point is that it
takes a raw text input which has to be re-parsed; that's still true
AFAICS.

I mean that in the past we recommended to use subselect to avoid extra
ts_headline() call, which now, at least at 9.6, it's obsoleted and two sql
queries call ts_headline() exactly 5 times.

Oh, I see your point: commit 9118d03a8 fixed the planner so you don't get
extra evaluations of ts_headline() in this example. I think it's probably
still appropriate to warn that ts_headline() is expensive, but yes, the
specific example is obsolete.

regards, tom lane
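
To see where the planner places the ts_headline() call, EXPLAIN (VERBOSE)
prints an "Output:" list for each plan node. A minimal sketch, assuming a
stand-in table apod_plan_demo (a hypothetical name; the definition of the
real apod table is not shown in this thread):

```sql
-- Hypothetical empty stand-in for apod; EXPLAIN without ANALYZE needs no rows.
CREATE TEMP TABLE apod_plan_demo (body text, fts tsvector);

-- VERBOSE adds an "Output:" line to every plan node, which shows
-- at which level each select-list expression is computed.
EXPLAIN (VERBOSE, COSTS OFF)
SELECT ts_headline(body, to_tsquery('english', 'supernovae & x-ray')),
       ts_rank(fts, to_tsquery('english', 'supernovae & x-ray')) AS rank
FROM apod_plan_demo
WHERE fts @@ to_tsquery('english', 'supernovae & x-ray')
ORDER BY rank DESC
LIMIT 5;
```

If the expression has been postponed as described above, ts_headline should
show up in the output list of a node above the Sort rather than in the scan
node. The exact plan shape depends on the server version and is not
something this thread shows.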
