tsearch vs. fulltextindex
Hi,
I've just done some performance comparisons between contrib/fulltextindex
and contrib/tsearch. Even with every optimisation I can think of for
fulltextindex, tsearch is 300 times faster ;)
Plus it doesn't require a separate table or complicated queries.
I think we should strongly encourage people to use tsearch instead of
fulltextindex. I hope to commit some change to fulltextindex in the near
future, so I'll add a note to the readme then.
Chris
eg:
australia=# explain analyse select food_id, category_id, description from
test_foods where not pending and fulltextidx ## 'baskin&fruit';
NOTICE: QUERY PLAN:
Index Scan using fulltextidx_idx on test_foods (cost=0.00..45.93 rows=11
width=40) (actual time=0.22..1.53 rows=8 loops=1)
Total runtime: 1.70 msec
EXPLAIN
australia=# explain analyze SELECT distinct(f.food_id), f.category_id,
f.description, f.brand FROM food_foods f, food_foods_fti f0, food_foods_fti
f1 WHERE NOT f.pending AND f0.id=f.oid AND f0.string ~ '^baskin' AND
f1.id=f.oid AND f1.string ~ '^fruit';
NOTICE: QUERY PLAN:
Unique (cost=12.10..12.11 rows=1 width=66) (actual time=532.11..532.25
rows=8 loops=1)
-> Sort (cost=12.10..12.10 rows=1 width=66) (actual time=532.10..532.14
rows=8 loops=1)
-> Nested Loop (cost=0.00..12.09 rows=1 width=66) (actual
time=292.41..531.89 rows=8 loops=1)
-> Nested Loop (cost=0.00..6.07 rows=1 width=8) (actual
time=292.35..531.35 rows=8 loops=1)
-> Index Scan using food_foods_fti_string_idx on
food_foods_fti f0 (cost=0.00..3.03 rows=1 width=4) (actual time=0.07..0.45
rows=23 loops=1)
-> Index Scan using food_foods_fti_string_idx on
food_foods_fti f1 (cost=0.00..3.03 rows=1 width=4) (actual time=0.04..16.52
rows=1092 loops=23)
-> Index Scan using food_foods_oid_idx on food_foods f
(cost=0.00..6.01 rows=1 width=58) (actual time=0.03..0.04 rows=1 loops=8)
Total runtime: 532.49 msec
EXPLAIN
Good point. Some said fulltextindex was better for certain queries, but
if no one can come up with such a case, we can remove it.
---------------------------------------------------------------------------
Christopher Kings-Lynne wrote:
Hi,
I've just done some performance comparisons between contrib/fulltextindex
and contrib/tsearch. Even with every optimisation I can think of for
fulltextindex, tsearch is 300 times faster ;)Plus it doesn't require a separate table or complicated queries.
I think we should strongly encourage people to use tsearch instead of
fulltextindex. I hope to commit some change to fulltextindex in the near
future, so I'll add a note to the readme then.Chris
eg:
australia=# explain analyse select food_id, category_id, description from
test_foods where not pending and fulltextidx ## 'baskin&fruit';
NOTICE: QUERY PLAN:Index Scan using fulltextidx_idx on test_foods (cost=0.00..45.93 rows=11
width=40) (actual time=0.22..1.53 rows=8 loops=1)
Total runtime: 1.70 msecEXPLAIN
australia=# explain analyze SELECT distinct(f.food_id), f.category_id,
f.description, f.brand FROM food_foods f, food_foods_fti f0, food_foods_fti
f1 WHERE NOT f.pending AND f0.id=f.oid AND f0.string ~ '^baskin' AND
f1.id=f.oid AND f1.string ~ '^fruit';
NOTICE: QUERY PLAN:Unique (cost=12.10..12.11 rows=1 width=66) (actual time=532.11..532.25
rows=8 loops=1)
-> Sort (cost=12.10..12.10 rows=1 width=66) (actual time=532.10..532.14
rows=8 loops=1)
-> Nested Loop (cost=0.00..12.09 rows=1 width=66) (actual
time=292.41..531.89 rows=8 loops=1)
-> Nested Loop (cost=0.00..6.07 rows=1 width=8) (actual
time=292.35..531.35 rows=8 loops=1)
-> Index Scan using food_foods_fti_string_idx on
food_foods_fti f0 (cost=0.00..3.03 rows=1 width=4) (actual time=0.07..0.45
rows=23 loops=1)
-> Index Scan using food_foods_fti_string_idx on
food_foods_fti f1 (cost=0.00..3.03 rows=1 width=4) (actual time=0.04..16.52
rows=1092 loops=23)
-> Index Scan using food_foods_oid_idx on food_foods f
(cost=0.00..6.01 rows=1 width=58) (actual time=0.03..0.04 rows=1 loops=8)
Total runtime: 532.49 msecEXPLAIN
---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Well, I think it shouldn't disappear for a few releases yet...
Chris
Show quoted text
-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org]On Behalf Of Bruce Momjian
Sent: Wednesday, 14 August 2002 12:43 PM
To: Christopher Kings-Lynne
Cc: Hackers; pgsql-general@postgresql.org; pgsql-sql@postgresql.org
Subject: Re: [HACKERS] tsearch vs. fulltextindexGood point. Some said fulltextindex was better for certain queries, but
if no one can come up with such a case, we can remove it.------------------------------------------------------------------
---------Christopher Kings-Lynne wrote:
Hi,
I've just done some performance comparisons between
contrib/fulltextindex
and contrib/tsearch. Even with every optimisation I can think of for
fulltextindex, tsearch is 300 times faster ;)Plus it doesn't require a separate table or complicated queries.
I think we should strongly encourage people to use tsearch instead of
fulltextindex. I hope to commit some change to fulltextindexin the near
future, so I'll add a note to the readme then.
Chris
eg:
australia=# explain analyse select food_id, category_id,
description from
test_foods where not pending and fulltextidx ## 'baskin&fruit';
NOTICE: QUERY PLAN:Index Scan using fulltextidx_idx on test_foods
(cost=0.00..45.93 rows=11
width=40) (actual time=0.22..1.53 rows=8 loops=1)
Total runtime: 1.70 msecEXPLAIN
australia=# explain analyze SELECT distinct(f.food_id), f.category_id,
f.description, f.brand FROM food_foods f, food_foods_fti f0,food_foods_fti
f1 WHERE NOT f.pending AND f0.id=f.oid AND f0.string ~ '^baskin' AND
f1.id=f.oid AND f1.string ~ '^fruit';
NOTICE: QUERY PLAN:Unique (cost=12.10..12.11 rows=1 width=66) (actual time=532.11..532.25
rows=8 loops=1)
-> Sort (cost=12.10..12.10 rows=1 width=66) (actualtime=532.10..532.14
rows=8 loops=1)
-> Nested Loop (cost=0.00..12.09 rows=1 width=66) (actual
time=292.41..531.89 rows=8 loops=1)
-> Nested Loop (cost=0.00..6.07 rows=1 width=8) (actual
time=292.35..531.35 rows=8 loops=1)
-> Index Scan using food_foods_fti_string_idx on
food_foods_fti f0 (cost=0.00..3.03 rows=1 width=4) (actualtime=0.07..0.45
rows=23 loops=1)
-> Index Scan using food_foods_fti_string_idx on
food_foods_fti f1 (cost=0.00..3.03 rows=1 width=4) (actualtime=0.04..16.52
rows=1092 loops=23)
-> Index Scan using food_foods_oid_idx on food_foods f
(cost=0.00..6.01 rows=1 width=58) (actual time=0.03..0.04rows=1 loops=8)
Total runtime: 532.49 msec
EXPLAIN
---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly-- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster
I've just done some performance comparisons between contrib/fulltextindex
and contrib/tsearch. Even with every optimisation I can think of for
fulltextindex, tsearch is 300 times faster ;)Plus it doesn't require a separate table or complicated queries.
I think we should strongly encourage people to use tsearch instead of
fulltextindex. I hope to commit some change to fulltextindex in the near
future, so I'll add a note to the readme then.
Ditto. We just replaced Oracle Context Search with tsearch and it's just as
good for what we were using it for and nicer to play with.
There's no scoring as far as I can see but I can live with that.
(and hello fellow aussie)
Cheers,
Graeme