Improving Full text performance

Started by xaviergxf over 16 years ago, 5 messages, general
#1 xaviergxf
xaviergxf@gmail.com

Hi,

I'm using PHP and full text search on PostgreSQL 8.3 to index HTML
descriptions. I have no access to the PostgreSQL server, since I use a
shared hosting service.
To improve search and performance, I want to do the following:

Strip all HTML tags, then use my PHP script to remove more stop words
(because I can't edit the stop words file on the server).
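
The two preprocessing steps described above can be sketched as follows. This is a minimal illustration in Python rather than the poster's PHP script; the `EXTRA_STOP_WORDS` set and all function names are hypothetical, and only the standard library is assumed.

```python
from html.parser import HTMLParser

# Hypothetical extra stop words; choose them from your own term statistics.
EXTRA_STOP_WORDS = {"click", "here", "page"}


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(html):
    """Return the text content of `html` with all tags dropped."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Join with spaces so words in adjacent elements do not run together,
    # then collapse runs of whitespace.
    return " ".join(" ".join(parser.parts).split())


def remove_stop_words(text, stop_words=EXTRA_STOP_WORDS):
    """Drop words whose lowercased, punctuation-trimmed form is a stop word."""
    kept = []
    for word in text.split():
        if word.strip(".,;:!?\"'()").lower() not in stop_words:
            kept.append(word)
    return " ".join(kept)


def prepare_for_indexing(html):
    """Apply both steps before the text is handed to the database."""
    return remove_stop_words(strip_tags(html))


print(prepare_for_indexing('<font color="red">Click here</font> for the <b>blue</b> widget'))
```

In PHP the first step is roughly what the built-in `strip_tags()` does. Two side effects seem worth keeping in mind: the sketch keeps the contents of `<script>` and `<style>` elements as ordinary text, and any word removed before indexing also has to be removed from the search terms, otherwise a query containing that word looks for a lexeme that was never indexed.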

My question: does what I'm planning to do have any side effects? Any
suggestions?

Thanks!

#2 ries van Twisk
pg@rvt.dds.nl
In reply to: xaviergxf (#1)
Re: Improving Full text performance

In these situations I would suggest using a real search engine (not that
PG's FT is not real...) like mnoGoSearch, Lucene, or others.

Ries

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

regards, Ries van Twisk

-------------------------------------------------------------------------------------------------
tags: Freelance TYPO3 Glassfish JasperReports JasperETL Flex Blaze-DS  
WebORB PostgreSQL DB-Architect
email: ries@vantwisk.nl        web:   http://www.rvantwisk.nl/     
skype: callto://r.vantwisk
Phone: +1-810-476-4196    Cell: +593 9901 7694                   SIP:  
+1-747-690-5133

#3 Oleg Bartunov
oleg@sai.msu.su
In reply to: xaviergxf (#1)
Re: Improving Full text performance

On Fri, 21 Aug 2009, xaviergxf wrote:

Hi,

I'm using PHP and full text search on PostgreSQL 8.3 to index HTML
descriptions. I have no access to the PostgreSQL server, since I use a
shared hosting service.
To improve search and performance, I want to do the following:

Strip all HTML tags, then use my PHP script to remove more stop words
(because I can't edit the stop words file on the server).

My question: does what I'm planning to do have any side effects? Any
suggestions?

You shouldn't bother stripping all the HTML tags; just create your own text
search configuration that indexes only what you want. Read the documentation
for details.
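
A custom configuration of the kind suggested here could look like the statements assembled below. This is a sketch under assumptions: the configuration name `html_descriptions` and the list of token types to drop are hypothetical choices, the account needs the CREATE privilege on a schema (which a shared host may or may not grant), and Python is used only to assemble the SQL, which can be run from any client.

```python
import re

# Token types of PostgreSQL's default parser that a search over product
# descriptions probably does not need; this list is an assumption.
UNWANTED_TOKEN_TYPES = ("email", "url", "url_path", "host", "file",
                        "float", "int", "uint", "sfloat", "version")


def build_config_sql(name, base="pg_catalog.english",
                     drop=UNWANTED_TOKEN_TYPES):
    """Return the SQL statements that create a trimmed-down configuration."""
    if not re.fullmatch(r"[a-z_][a-z0-9_]*", name):
        raise ValueError("configuration name must be a plain identifier")
    return [
        # Start from a copy of an existing configuration.
        f"CREATE TEXT SEARCH CONFIGURATION {name} (COPY = {base});",
        # Stop indexing the token types that are not wanted.
        f"ALTER TEXT SEARCH CONFIGURATION {name} "
        f"DROP MAPPING IF EXISTS FOR {', '.join(drop)};",
    ]


for statement in build_config_sql("html_descriptions"):
    print(statement)
```

The new configuration is then passed by name, for example `to_tsvector('html_descriptions', description)`. As far as I know, the stock configurations have no mapping for the parser's `tag` token type, so well-formed tags should already be skipped; `SELECT alias, token, lexemes FROM ts_debug('english', '<font>text</font>')` shows how a given input is actually classified.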

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

#4 xaviergxf
xaviergxf@gmail.com
In reply to: xaviergxf (#1)
Re: Improving Full text performance

If I strip all HTML tags and filter more stop words, will the search
be more accurate? Currently my full-text stats return entries such as
"font" (from <font> tags, I guess) and other garbage.
If I do that, will it improve the speed of my search?

Thanks!

PS: I cannot use other tools like mnoGoSearch, Lucene, etc. because I
have no root password for my server.

#5 Oleg Bartunov
oleg@sai.msu.su
In reply to: xaviergxf (#4)
Re: Improving Full text performance

On Sat, 22 Aug 2009, xaviergxf wrote:

If I strip all HTML tags and filter more stop words, will the search
be more accurate? Currently my full-text stats return entries such as
"font" (from <font> tags, I guess) and other garbage.
If I do that, will it improve the speed of my search?

What do you mean by 'accurate'? You need to be a bit more 'accurate'
yourself when asking :) You need to provide more information about your
problem: for example, the version of PostgreSQL, the size of the collection
you indexed, the EXPLAIN ANALYZE output for your query, the 'garbage' you
got, etc. This is not difficult, just copy-and-paste work.
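
The information asked for here can be gathered with a handful of queries, sketched below as a small Python helper that only assembles the SQL. The table name, the tsvector column name, and the sample search term are hypothetical placeholders, and the sketch assumes the tsvector is stored in its own column. The statements are meant to be pasted into a client by hand; this is not a safe way to run user input.

```python
import re

_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def diagnostic_queries(table, tsv_column, sample_query, config="english"):
    """Return labelled SQL statements whose output is worth posting."""
    for ident in (table, tsv_column, config):
        if not _IDENT.fullmatch(ident):
            raise ValueError(f"not a plain identifier: {ident!r}")
    term = sample_query.replace("'", "''")  # double quotes for an SQL literal
    return {
        "version": "SELECT version();",
        "collection size": f"SELECT count(*) FROM {table};",
        "query plan": (
            f"EXPLAIN ANALYZE SELECT * FROM {table} "
            f"WHERE {tsv_column} @@ plainto_tsquery('{config}', '{term}');"
        ),
        # ts_stat lists each indexed lexeme with its document and entry
        # counts, which is where 'garbage' such as "font" shows up.
        "most frequent lexemes": (
            f"SELECT word, ndoc, nentry "
            f"FROM ts_stat('SELECT {tsv_column} FROM {table}') "
            f"ORDER BY nentry DESC LIMIT 20;"
        ),
    }


for label, sql in diagnostic_queries("products", "tsv", "red font").items():
    print(f"-- {label}\n{sql}\n")
```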

Regards,
Oleg