tsearch vs. fulltextindex

Started by Christopher Kings-Lynneover 23 years ago4 messages
#1Christopher Kings-Lynne
chriskl@familyhealth.com.au

Hi,

I've just done some performance comparisons between contrib/fulltextindex
and contrib/tsearch. Even with every optimisation I can think of for
fulltextindex, tsearch is 300 times faster ;)

Plus it doesn't require a separate table or complicated queries.

I think we should strongly encourage people to use tsearch instead of
fulltextindex. I hope to commit some change to fulltextindex in the near
future, so I'll add a note to the readme then.

Chris

eg:

australia=# explain analyse select food_id, category_id, description from
test_foods where not pending and fulltextidx ## 'baskin&fruit';
NOTICE: QUERY PLAN:

Index Scan using fulltextidx_idx on test_foods (cost=0.00..45.93 rows=11
width=40) (actual time=0.22..1.53 rows=8 loops=1)
Total runtime: 1.70 msec

EXPLAIN
australia=# explain analyze SELECT distinct(f.food_id), f.category_id,
f.description, f.brand FROM food_foods f, food_foods_fti f0, food_foods_fti
f1 WHERE NOT f.pending AND f0.id=f.oid AND f0.string ~ '^baskin' AND
f1.id=f.oid AND f1.string ~ '^fruit';
NOTICE: QUERY PLAN:

Unique (cost=12.10..12.11 rows=1 width=66) (actual time=532.11..532.25
rows=8 loops=1)
-> Sort (cost=12.10..12.10 rows=1 width=66) (actual time=532.10..532.14
rows=8 loops=1)
-> Nested Loop (cost=0.00..12.09 rows=1 width=66) (actual
time=292.41..531.89 rows=8 loops=1)
-> Nested Loop (cost=0.00..6.07 rows=1 width=8) (actual
time=292.35..531.35 rows=8 loops=1)
-> Index Scan using food_foods_fti_string_idx on
food_foods_fti f0 (cost=0.00..3.03 rows=1 width=4) (actual time=0.07..0.45
rows=23 loops=1)
-> Index Scan using food_foods_fti_string_idx on
food_foods_fti f1 (cost=0.00..3.03 rows=1 width=4) (actual time=0.04..16.52
rows=1092 loops=23)
-> Index Scan using food_foods_oid_idx on food_foods f
(cost=0.00..6.01 rows=1 width=58) (actual time=0.03..0.04 rows=1 loops=8)
Total runtime: 532.49 msec

EXPLAIN

#2Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Christopher Kings-Lynne (#1)
Re: [HACKERS] tsearch vs. fulltextindex

Good point. Some said fulltextindex was better for certain queries, but
if no one can come up with such a case, we can remove it.

---------------------------------------------------------------------------

Christopher Kings-Lynne wrote:

Hi,

I've just done some performance comparisons between contrib/fulltextindex
and contrib/tsearch. Even with every optimisation I can think of for
fulltextindex, tsearch is 300 times faster ;)

Plus it doesn't require a separate table or complicated queries.

I think we should strongly encourage people to use tsearch instead of
fulltextindex. I hope to commit some change to fulltextindex in the near
future, so I'll add a note to the readme then.

Chris

eg:

australia=# explain analyse select food_id, category_id, description from
test_foods where not pending and fulltextidx ## 'baskin&fruit';
NOTICE: QUERY PLAN:

Index Scan using fulltextidx_idx on test_foods (cost=0.00..45.93 rows=11
width=40) (actual time=0.22..1.53 rows=8 loops=1)
Total runtime: 1.70 msec

EXPLAIN
australia=# explain analyze SELECT distinct(f.food_id), f.category_id,
f.description, f.brand FROM food_foods f, food_foods_fti f0, food_foods_fti
f1 WHERE NOT f.pending AND f0.id=f.oid AND f0.string ~ '^baskin' AND
f1.id=f.oid AND f1.string ~ '^fruit';
NOTICE: QUERY PLAN:

Unique (cost=12.10..12.11 rows=1 width=66) (actual time=532.11..532.25
rows=8 loops=1)
-> Sort (cost=12.10..12.10 rows=1 width=66) (actual time=532.10..532.14
rows=8 loops=1)
-> Nested Loop (cost=0.00..12.09 rows=1 width=66) (actual
time=292.41..531.89 rows=8 loops=1)
-> Nested Loop (cost=0.00..6.07 rows=1 width=8) (actual
time=292.35..531.35 rows=8 loops=1)
-> Index Scan using food_foods_fti_string_idx on
food_foods_fti f0 (cost=0.00..3.03 rows=1 width=4) (actual time=0.07..0.45
rows=23 loops=1)
-> Index Scan using food_foods_fti_string_idx on
food_foods_fti f1 (cost=0.00..3.03 rows=1 width=4) (actual time=0.04..16.52
rows=1092 loops=23)
-> Index Scan using food_foods_oid_idx on food_foods f
(cost=0.00..6.01 rows=1 width=58) (actual time=0.03..0.04 rows=1 loops=8)
Total runtime: 532.49 msec

EXPLAIN

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#3Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Bruce Momjian (#2)
Re: [HACKERS] tsearch vs. fulltextindex

Well, I think it shouldn't disappear for a few releases yet...

Chris

Show quoted text

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org]On Behalf Of Bruce Momjian
Sent: Wednesday, 14 August 2002 12:43 PM
To: Christopher Kings-Lynne
Cc: Hackers; pgsql-general@postgresql.org; pgsql-sql@postgresql.org
Subject: Re: [HACKERS] tsearch vs. fulltextindex

Good point. Some said fulltextindex was better for certain queries, but
if no one can come up with such a case, we can remove it.

------------------------------------------------------------------
---------

Christopher Kings-Lynne wrote:

Hi,

I've just done some performance comparisons between

contrib/fulltextindex

and contrib/tsearch. Even with every optimisation I can think of for
fulltextindex, tsearch is 300 times faster ;)

Plus it doesn't require a separate table or complicated queries.

I think we should strongly encourage people to use tsearch instead of
fulltextindex. I hope to commit some change to fulltextindex

in the near

future, so I'll add a note to the readme then.

Chris

eg:

australia=# explain analyse select food_id, category_id,

description from

test_foods where not pending and fulltextidx ## 'baskin&fruit';
NOTICE: QUERY PLAN:

Index Scan using fulltextidx_idx on test_foods

(cost=0.00..45.93 rows=11

width=40) (actual time=0.22..1.53 rows=8 loops=1)
Total runtime: 1.70 msec

EXPLAIN
australia=# explain analyze SELECT distinct(f.food_id), f.category_id,
f.description, f.brand FROM food_foods f, food_foods_fti f0,

food_foods_fti

f1 WHERE NOT f.pending AND f0.id=f.oid AND f0.string ~ '^baskin' AND
f1.id=f.oid AND f1.string ~ '^fruit';
NOTICE: QUERY PLAN:

Unique (cost=12.10..12.11 rows=1 width=66) (actual time=532.11..532.25
rows=8 loops=1)
-> Sort (cost=12.10..12.10 rows=1 width=66) (actual

time=532.10..532.14

rows=8 loops=1)
-> Nested Loop (cost=0.00..12.09 rows=1 width=66) (actual
time=292.41..531.89 rows=8 loops=1)
-> Nested Loop (cost=0.00..6.07 rows=1 width=8) (actual
time=292.35..531.35 rows=8 loops=1)
-> Index Scan using food_foods_fti_string_idx on
food_foods_fti f0 (cost=0.00..3.03 rows=1 width=4) (actual

time=0.07..0.45

rows=23 loops=1)
-> Index Scan using food_foods_fti_string_idx on
food_foods_fti f1 (cost=0.00..3.03 rows=1 width=4) (actual

time=0.04..16.52

rows=1092 loops=23)
-> Index Scan using food_foods_oid_idx on food_foods f
(cost=0.00..6.01 rows=1 width=58) (actual time=0.03..0.04

rows=1 loops=8)

Total runtime: 532.49 msec

EXPLAIN

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

--
Bruce Momjian                        |  http://candle.pha.pa.us
pgman@candle.pha.pa.us               |  (610) 359-1001
+  If your life is a hard drive,     |  13 Roberts Road
+  Christ can be your backup.        |  Newtown Square,
Pennsylvania 19073

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

#4Graeme Merrall
gbmerrall@aol.com
In reply to: Christopher Kings-Lynne (#1)
Re: tsearch vs. fulltextindex

I've just done some performance comparisons between contrib/fulltextindex
and contrib/tsearch. Even with every optimisation I can think of for
fulltextindex, tsearch is 300 times faster ;)

Plus it doesn't require a separate table or complicated queries.

I think we should strongly encourage people to use tsearch instead of
fulltextindex. I hope to commit some change to fulltextindex in the near
future, so I'll add a note to the readme then.

Ditto. We just replaced Oracle Context Search with tsearch and it's just as
good for what we were using it for and nicer to play with.
There's no scoring as far as I can see but I can live with that.

(and hello fellow aussie)

Cheers,
Graeme