tsearch2: very slow queries

Started by The Hermit Hacker, almost 21 years ago, 4 messages, general
#1The Hermit Hacker
scrappy@hub.org

'k, I'm obviously doing something wrong, since my experiences with sites
like fts.postgresql.org indicate things should be *a lot* faster than I'm
getting ...

I have a *very* simple table:

=# \d article_tsearch
Table "public.article_tsearch"
   Column   |   Type   | Modifiers
------------+----------+-----------
 article_id | integer  |
 idxft1     | tsvector |
Indexes:
    "at_idxft1_idx" gist (idxft1)

rblog=# select count(1) from article_tsearch;
count
--------
643072
(1 row)

rblog=# select count(1) from article_tsearch where idxFT1 @@ to_tsquery('1&dvd');
count
-------
1681
(1 row)

But, it just seems to take so long to do the query itself:

# explain analyze select * from article_tsearch where idxFT1 @@ to_tsquery('1&dvd') order by rank(idxFT1, to_tsquery('1&dvd')) desc limit 26;
                                   QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=2625.53..2625.60 rows=26 width=36) (actual time=20164.262..20164.597 rows=26 loops=1)
   ->  Sort  (cost=2625.53..2627.14 rows=644 width=36) (actual time=20164.257..20164.298 rows=26 loops=1)
         Sort Key: rank(idxft1, '\'1\' & \'dvd\''::tsquery)
         ->  Index Scan using at_idxft1_idx on article_tsearch  (cost=0.00..2595.48 rows=644 width=36) (actual time=29.476..20153.530 rows=1681 loops=1)
               Index Cond: (idxft1 @@ '\'1\' & \'dvd\''::tsquery)
               Filter: (idxft1 @@ '\'1\' & \'dvd\''::tsquery)
 Total runtime: 20166.326 ms
(7 rows)

If it is, then I'm obviously overlooking something key here ... now, I've
read through the docs in contrib/tsearch2/docs, and don't *think* I've
missed anything obvious ... it seems fairly straightforward ...

Is there something else I should be doing to speed the query up any? Or
is this fairly normal?
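
For what it's worth, the plan above puts the index scan at roughly 20153 ms of the 20166 ms total (about 12 ms per matching row for 1681 rows), so the sort on rank() and the limit only add around 11 ms. A minimal sketch of how one might time the index scan on its own, assuming the same table and query as above:

```sql
-- Time only the tsearch2 index scan, with no rank() sort or LIMIT on top.
-- Running it twice in a row gives a rough idea of how much of the time is
-- disk I/O: the second run should find most pages already cached.
EXPLAIN ANALYZE
SELECT count(*)
FROM article_tsearch
WHERE idxFT1 @@ to_tsquery('1&dvd');
```

If the second run is much faster than the first, the bottleneck is probably fetching index and heap pages from disk rather than evaluating the match itself. That is only a rule of thumb, not something this plan proves.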

----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org Yahoo!: yscrappy ICQ: 7615664

#2Joshua D. Drake
jd@commandprompt.com
In reply to: The Hermit Hacker (#1)
Re: tsearch2: very slow queries

Marc G. Fournier wrote:

> 'k, I'm obviously doing something wrong, since my experiences with sites
> like fts.postgresql.org indicate things should be *a lot* faster than I'm
> getting ...

Well the first thing I would ask is are you running 8.0? My testing
shows that Tsearch is pretty abysmal if you are not running 8.0. At
least with very large tables.

> Is there something else I should be doing to speed the query up any? Or
> is this fairly normal?

Considering the number of rows I am not that surprised, but I would be
curious to know what type of HD you have. Also, correct me if I am wrong,
but gist indexes are typically very large. Do you have enough
work_mem/sort_mem to keep them from going to disk?
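
A rough sketch of how the points raised here (server version, sort memory, index size) could be checked from psql. The table and index names are the ones from the original post. The 8 kB page size is an assumption (it is the default block size), and relpages is only an estimate that VACUUM / ANALYZE refresh:

```sql
-- Which server version is this?
SELECT version();

-- Per-sort memory setting: sort_mem is the 7.4 name,
-- work_mem is the name from 8.0 on.
SHOW sort_mem;
SHOW work_mem;

-- Approximate on-disk size of the table and its GiST index.
-- Assumes the default 8 kB block size; relpages is refreshed
-- by VACUUM / ANALYZE, so run one of those first.
SELECT relname, relkind, relpages, relpages * 8 AS approx_kb
FROM pg_class
WHERE relname IN ('article_tsearch', 'at_idxft1_idx');
```

On a 7.4 server the SHOW work_mem line will simply report an unrecognized parameter. Comparing approx_kb for the index against the available RAM gives a first idea of whether the index can stay cached.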

Sincerely,

Joshua D. Drake

--
Your PostgreSQL solutions provider, Command Prompt, Inc.
24x7 support - 1.800.492.2240, programming, and consulting
Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit
http://www.commandprompt.com / http://www.postgresql.org

#3The Hermit Hacker
scrappy@hub.org
In reply to: Joshua D. Drake (#2)
Re: tsearch2: very slow queries

On Sun, 7 Aug 2005, Joshua D. Drake wrote:

> Marc G. Fournier wrote:
>
>> 'k, I'm obviously doing something wrong, since my experiences with sites
>> like fts.postgresql.org indicate things should be *a lot* faster than I'm
>> getting ...
>
> Well the first thing I would ask is are you running 8.0? My testing
> shows that Tsearch is pretty abysmal if you are not running 8.0. At
> least with very large tables.

This is one thing I was fearing, especially with the work that Teodor and
gang have been putting into it for 8.1 :( Unfortunately, we're currently
stuck with 7.4.6 for this, so that is one thing I'm going to have to take
into consideration ...

> Considering the number of rows I am not that surprised but I would be
> curious to know what type of HD you have? Also correct me if I am wrong
> but gist indexes are typically very large. Do you have enough
> work_mem/sort_mem to keep them from going to disk?

I'm currently playing in a non-production environment (i.e. my desktop
machine) just to get a feel for things ... our main server for this has a
proper 4G of RAM, sort_mem bumped up quite nicely, and the file system
spread over multiple spindles ...

Right now, I'm just playing with / learning the tsearch stuff, so I'm more
looking at this as the 'worst case scenario on my box' ... not perfect, but
anything I can improve here, I know will be easier to improve on the
production server :)

#4Oleg Bartunov
oleg@sai.msu.su
In reply to: The Hermit Hacker (#3)
Re: tsearch2: very slow queries

Hi there,

tsearch2 is the next problem we plan to attack once we are done with the
GiST core. We have actually run some experiments in the background, and
we're sure we can greatly improve tsearch2 performance and add a lot of
nice features. Most probably, we'll call for fundraising for this project
as soon as we find out how to reliably transfer money to us.

Oleg

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83