a little fix for text search
Hi there!
I don't know exactly when it was improved, but the following notice in
https://www.postgresql.org/docs/current/static/textsearch-controls.html#TEXTSEARCH-HEADLINE
is no longer needed:
ts_headline uses the original document, not a tsvector summary, so it can
be slow and should be used with care. A typical mistake is to call
ts_headline for every matching document when only ten documents are to be
shown. SQL subqueries can help; here is an example:
Regards,
Oleg
Oleg Bartunov <obartunov@gmail.com> writes:
> Hi there!
> I don't know exactly when it was improved, but the following notice in
> https://www.postgresql.org/docs/current/static/textsearch-controls.html#TEXTSEARCH-HEADLINE
> is no longer needed:
> ts_headline uses the original document, not a tsvector summary, so it can
> be slow and should be used with care. A typical mistake is to call
> ts_headline for every matching document when only ten documents are to be
> shown. SQL subqueries can help; here is an example:
I don't see why that stopped being appropriate? The point is that it
takes a raw text input which has to be re-parsed; that's still true
AFAICS.
regards, tom lane
--
Sent via pgsql-docs mailing list (pgsql-docs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-docs
On Sat, Nov 12, 2016 at 11:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Oleg Bartunov <obartunov@gmail.com> writes:
>> Hi there!
>> I don't know exactly when it was improved, but the following notice in
>> https://www.postgresql.org/docs/current/static/textsearch-controls.html#TEXTSEARCH-HEADLINE
>> is no longer needed:
>> ts_headline uses the original document, not a tsvector summary, so it can
>> be slow and should be used with care. A typical mistake is to call
>> ts_headline for every matching document when only ten documents are to be
>> shown. SQL subqueries can help; here is an example:
> I don't see why that stopped being appropriate? The point is that it
> takes a raw text input which has to be re-parsed; that's still true
> AFAICS.
I mean that in the past we recommended using a subselect to avoid extra
ts_headline() calls. Now, at least in 9.6, that advice is obsolete: both
SQL queries call ts_headline() exactly 5 times.
select ts_headline(body, to_tsquery('supernovae & x-ray')),
       ts_rank(fts, to_tsquery('supernovae & x-ray')) as rank
from apod
where fts @@ to_tsquery('supernovae & x-ray')
order by rank desc limit 5;

explain (analyze, costs off)
select ts_headline(body, to_tsquery('supernovae & x-ray')), rank
from (
    select body, ts_rank(fts, to_tsquery('supernovae & x-ray')) as rank
    from apod
    where fts @@ to_tsquery('supernovae & x-ray')
    order by rank desc limit 5
) as foo;
Oleg Bartunov <obartunov@gmail.com> writes:
> On Sat, Nov 12, 2016 at 11:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I don't see why that stopped being appropriate? The point is that it
>> takes a raw text input which has to be re-parsed; that's still true
>> AFAICS.
> I mean that in the past we recommended using a subselect to avoid extra
> ts_headline() calls. Now, at least in 9.6, that advice is obsolete: both
> SQL queries call ts_headline() exactly 5 times.
Oh, I see your point: commit 9118d03a8 fixed the planner so you don't get
extra evaluations of ts_headline() in this example. I think it's probably
still appropriate to warn that ts_headline() is expensive, but yes, the
specific example is obsolete.
regards, tom lane
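
For anyone who wants to check the call counts discussed above on their own
installation, one rough approach is to wrap ts_headline() in a counting
function. This is only a sketch under stated assumptions: the sequence
headline_calls and the function counted_headline() are hypothetical names
invented for illustration, and the apod table with its body and fts columns
is assumed to exist as in the queries from this thread. Because the wrapper
is VOLATILE, the planner may not treat it exactly as it treats ts_headline()
itself, so the result is an indication rather than proof.

```sql
-- Hypothetical helpers, not part of PostgreSQL: a sequence used as a call
-- counter and a wrapper that bumps it before delegating to ts_headline().
CREATE SEQUENCE headline_calls;

CREATE FUNCTION counted_headline(doc text, q tsquery) RETURNS text
LANGUAGE plpgsql VOLATILE AS $$
BEGIN
    PERFORM nextval('headline_calls');   -- record one call
    RETURN ts_headline(doc, q);          -- two-argument form, default config
END;
$$;

-- Single-level form of the query, with the wrapper substituted.
SELECT counted_headline(body, to_tsquery('supernovae & x-ray')),
       ts_rank(fts, to_tsquery('supernovae & x-ray')) AS rank
FROM apod
WHERE fts @@ to_tsquery('supernovae & x-ray')
ORDER BY rank DESC
LIMIT 5;

-- last_value is the number of wrapper calls once is_called is true.
SELECT last_value, is_called FROM headline_calls;

-- Reset the counter, then repeat with the subselect form.
ALTER SEQUENCE headline_calls RESTART;

SELECT counted_headline(body, to_tsquery('supernovae & x-ray')), rank
FROM (
    SELECT body, ts_rank(fts, to_tsquery('supernovae & x-ray')) AS rank
    FROM apod
    WHERE fts @@ to_tsquery('supernovae & x-ray')
    ORDER BY rank DESC
    LIMIT 5
) AS foo;

SELECT last_value, is_called FROM headline_calls;
```

If the behavior described in this thread holds, both forms should report the
same count. On releases that predate the planner change, the single-level
form would be expected to report one call per matching row instead.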