Re: tsearch comments

Started by Uros Gruber, almost 23 years ago (5 messages)
#1 Uros Gruber
uros@sir-mag.com

Hi!

I think that this would be nice. OpenFTS is great, but it would
be great if this were implemented in real pg functions.

I think indexing would be great if pg did it by itself.

It would also be great if we could define the order of weight of
columns.

bye Uros

On 28.01.2003 at 11:53:26, Oleg Bartunov <oleg@sai.msu.su> wrote:

On Tue, 28 Jan 2003 sector119@mail.ru wrote:

Hi,

will we see sort by relevance in the tsearch alpha version? :)

Not sure. We are concentrating our efforts (well, Teodor is working
on it) on better configurability of tsearch, as in OpenFTS.

It's not difficult to add a rather naive relevance based on the
position of a lexeme in the document, for example. The question is,
do you like that kind of relevancy? Real ranking support (as in
OpenFTS) requires separate tables to maintain coordinate information.
We want to keep tsearch as simple as it is, and for now we are just
adding better and friendlier configurability. Do we need to
complicate tsearch? We already have OpenFTS, which has most of the
features people have requested.
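
To make the "rather naive relevance based on the position of a lexeme" idea concrete, here is a minimal Python sketch. It is not tsearch or OpenFTS code: the function name `position_rank`, the whitespace tokenizer, and the 1 / (1 + position) scoring are all assumptions, chosen only to show that earlier matches can be scored higher using nothing but the document itself.

```python
def position_rank(document: str, query: str) -> float:
    """Score a document by how early the query words first appear.

    Hypothetical scoring: each query word contributes 1 / (1 + p), where p
    is the zero-based position of its first occurrence in the document.
    Query words that do not occur contribute nothing.
    """
    words = document.lower().split()
    first_pos = {}
    for pos, word in enumerate(words):
        # Remember only the first position of every word.
        first_pos.setdefault(word, pos)

    score = 0.0
    for term in query.lower().split():
        if term in first_pos:
            score += 1.0 / (1 + first_pos[term])
    return score


# Two made-up documents: the query word comes first in "a" and last in "b".
docs = {
    "a": "postgresql full text search with tsearch",
    "b": "notes on indexing and later some words about postgresql",
}
ranked = sorted(
    docs, key=lambda name: position_rank(docs[name], "postgresql"), reverse=True
)
```

With this scoring, document "a" sorts ahead of "b" for the query "postgresql". No extra table is involved, which is the sense in which such a ranking stays simple.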

#2 Oleg Bartunov
oleg@sai.msu.su
In reply to: Uros Gruber (#1)

On Tue, 28 Jan 2003, Uros Gruber wrote:

It would also be great if we could define the order of weight of
columns.

Could you elaborate on this?

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

#3 eric@did-it.com
eric@did-it.com
In reply to: Oleg Bartunov (#2)

Hi,

I guess what we're looking for is something on the order (as much as I
hate using it as a reference) of MySQL's full text search which does
offer some ranking.

Just putting ranking alone in tsearch would be a huge benefit. Users can
then decide in their own language how to display results, especially
since those results may not necessarily require titles or description
fragments.

For example, we have several huge tables that have the following
columns:

id
tbltype
title
description

Basically, our customers will look up words that are contained in title
and description, so we make an additional table like:

id
tblid (id of the source table)
tblsource (which table)
content (txtidx)

Then we can use tsearch to search the second table (we do now), and once
we retrieve the ids that we want, we can display results from one or
more source tables. Just putting ranking in tsearch would solve all
these problems.
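
The two-step lookup described above (search the extra table, then fetch rows from the source tables) can be sketched in Python as follows. This only models the data flow and is not how tsearch stores anything: the `SearchRow` class, the sample table names `articles` and `products`, the sample rows, and the use of a plain set of words as a stand-in for the `txtidx` column are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class SearchRow:
    id: int
    tblid: int        # id of the row in the source table
    tblsource: str    # which source table the row came from
    content: set      # stand-in for the txtidx column: lowercased words


# Hypothetical source tables, keyed by table name and then by row id.
source_tables = {
    "articles": {
        1: {"title": "Diet tips", "description": "Healthy eating advice"},
    },
    "products": {
        7: {"title": "Running shoes", "description": "Light shoes for healthy runners"},
    },
}

# The additional search table, one row per source row.
search_table = [
    SearchRow(1, 1, "articles", {"diet", "tips", "healthy", "eating", "advice"}),
    SearchRow(2, 7, "products", {"running", "shoes", "light", "for", "healthy", "runners"}),
]


def lookup(word: str):
    """Step 1: match rows in the search table.
    Step 2: use (tblsource, tblid) to fetch the title from the source table."""
    hits = [row for row in search_table if word.lower() in row.content]
    return [
        (row.tblsource, source_tables[row.tblsource][row.tblid]["title"])
        for row in hits
    ]
```

In this sketch `lookup("healthy")` returns both rows in table order, with nothing to say which one is the better match. That missing ordering is the ranking being asked for.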

- Ericson Smith
http://www.did-it.com
http://www.weightlossfriends.com

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

#4 Oleg Bartunov
oleg@sai.msu.su
In reply to: eric@did-it.com (#3)

Hmm, people usually use concatenation to get the same result. Do you really
need that table? Your problem doesn't relate to the ranking of results.

We could add some ranking support based on local (per-document) statistics.
Keeping global statistics, for example TFxIDF, would complicate tsearch
and the maintenance of indices. Proximity ranking as in OpenFTS requires
more options in the tsearch configuration. Let us think about ranking later,
after we implement a friendly interface.
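
The difference between local (per-document) and global statistics can be shown with a small Python sketch of TF and IDF. This is a textbook-style illustration, not tsearch code: the function names, the tiny corpus, and the exact formulas (relative term frequency, natural-log IDF) are assumptions made for the example.

```python
import math
from collections import Counter


def term_frequency(doc_words, term):
    """Local statistic: computed from one document and nothing else."""
    if not doc_words:
        return 0.0
    return Counter(doc_words)[term] / len(doc_words)


def inverse_document_frequency(corpus, term):
    """Global statistic: needs a count over every document in the collection."""
    containing = sum(1 for words in corpus if term in words)
    return math.log(len(corpus) / containing) if containing else 0.0


# A made-up collection of three tokenized documents.
corpus = [
    "tsearch adds full text search".split(),
    "full text search ranking".split(),
    "ranking by ranking statistics".split(),
]

tf = term_frequency(corpus[2], "ranking")             # depends on one document
idf = inverse_document_frequency(corpus, "ranking")   # depends on all documents
tf_idf = tf * idf
```

Adding or deleting any document changes `idf` for every document that contains the term, while `tf` for an unchanged document stays the same. That is why keeping global statistics makes index maintenance harder.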

Regards,
Oleg

#5 eric@did-it.com
eric@did-it.com
In reply to: Oleg Bartunov (#4)

Oleg,

We actually have several somewhat similar tables (A, B, C, D, E...) that
have some textual/varchar content. Thus we make a search table Z that
concatenates the textual info from the first tables. Sure, we could
probably use unions and the like, but performance reasons prohibit
that scenario :-)

It's much better to search the search table, then show the relevant data
from the source tables based on ranked results.
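
As a rough Python sketch of this arrangement, the code below builds a search table Z by concatenating the text columns of two source tables and then returns ranked references back to them. Everything specific here is an assumption: the sample rows, the column names, the helper names `build_search_table` and `ranked_search`, and above all the ranking itself, which simply counts occurrences of the word because the thread does not settle on a ranking method.

```python
# Hypothetical source tables A and B with different text columns.
table_a = [{"id": 1, "title": "Full text search", "body": "search inside PostgreSQL"}]
table_b = [{"id": 5, "name": "Search tips", "notes": "ranking results"}]


def build_search_table(sources):
    """Concatenate the text columns of every source row into one search row."""
    rows = []
    for table_name, (table, text_columns) in sources.items():
        for row in table:
            content = " ".join(row[col] for col in text_columns).lower()
            rows.append(
                {"tblsource": table_name, "tblid": row["id"], "content": content}
            )
    return rows


table_z = build_search_table({
    "A": (table_a, ["title", "body"]),
    "B": (table_b, ["name", "notes"]),
})


def ranked_search(word):
    """Return (tblsource, tblid) pairs, most occurrences of `word` first."""
    word = word.lower()
    scored = [(row["content"].split().count(word), row) for row in table_z]
    hits = [(count, row) for count, row in scored if count > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [(row["tblsource"], row["tblid"]) for _, row in hits]
```

Only table Z is scanned during the search. The source tables are touched afterwards, using the returned `(tblsource, tblid)` pairs, which avoids the union across A, B, C and so on.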

- Ericson Smith
