Boolean text, with phrase ranking, search under Postgres

Started by mlwover 25 years ago5 messagesgeneral
Jump to latest
#1mlw
markw@mohawksoft.com

I am working on a GPL version of a boolean text search engine for
PostgreSQL.

How it works:

You run a program which executes a query and builds a set of external
indexes.

Then you run a daemon process which does the processing of the text
query.

In postgres, you create a temporary table of results, call textsearch
which populates the table,
lastly, you join with the results table. The code looks like this:

--
-- Create a temporary table for search results
--
create temp table search_result (key integer, rank integer);
--
-- Call search daemon to populate table
--
select textsearch('performer2{ waitresses } song { i know what boys like
}');
--
-- Join result table with data tables
--
select title, song, performer2, rank from zsong, ztitles, search_result
where search_result.key = zsong.trackid and
zsong.muzenbr = ztitles.muzenbr
order by search_result.rank desc;
--
-- Finished with result table, drop it.
--
drop table search_result ;
<<<<<<<<<<<<<<<<

he textsearch function looks like:

create function textsearch(varchar) returns integer as
'
DECLARE
handle integer;
count integer;
pos integer;
BEGIN
handle = search_exec( \'localhost\', $1);

count = search_count(handle);

for pos in 0 .. count-1 loop
insert into search_result(key, rank)
values (search_key(handle,pos),
search_rank(handle,pos));
end loop;

return search_done(handle);

END;
' language 'plpgsql';

What I would like to do is create the result table in the function
and/or accept a table name as a parameter. I can't seem to do this,
perhaps I am missing something trivial.

Is there a way to create a table from within 'c' and return it? I am
not a postgres pro, I am a C/C++ guy and do not know the ins and outs of
Postgres, and it should be a lot easier to make something more eficient.
Any insign would be appreciated.

#2Edmar Wiggers
edmar@brasmap.com
In reply to: mlw (#1)
C function returning rows, was Boolean text, with phrase ranking, search under Postgres

I too am interested in full text indexing under PostgreSQL.

Moreover, I have run into the same problem as you (in a different project
though): how do you return more than 1 value from a C function? I'd like to
return "rows", but, if at all possible, NOT to create a table. How about a
cursor? Perhaps one could then do something such as

insert into new_table select textsearch('bla...');

Does anyone know if it's possible?

-----Original Message-----
From: markw <markw@mohawksoft.com>
Sent: Monday, October 16, 2000 3:33 PM
To: pgsql-announce@postgresql.org,pgsql-general@postgresql.org
Subject: [ANNOUNCE] Boolean text, with phrase ranking, search under
Postgres

I am working on a GPL version of a boolean text search engine for
PostgreSQL.

How it works:

You run a program which executes a query and builds a set of external
indexes.

Then you run a daemon process which does the processing of the text
query.

In postgres, you create a temporary table of results, call textsearch
which populates the table,
lastly, you join with the results table. The code looks like this:

--
-- Create a temporary table for search results
--
create temp table search_result (key integer, rank integer);
--
-- Call search daemon to populate table
--
select textsearch('performer2{ waitresses } song { i know what boys like
}');

... (cut) ...

Show quoted text

What I would like to do is create the result table in the function
and/or accept a table name as a parameter. I can't seem to do this,
perhaps I am missing something trivial.

Is there a way to create a table from within 'c' and return it? I am
not a postgres pro, I am a C/C++ guy and do not know the ins and outs of
Postgres, and it should be a lot easier to make something more eficient.
Any insign would be appreciated.

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Edmar Wiggers (#2)
Re: C function returning rows, was Boolean text, with phrase ranking, search under Postgres

"Edmar Wiggers" <edmar@brasmap.com> writes:

Moreover, I have run into the same problem as you (in a different project
though): how do you return more than 1 value from a C function?

In current releases, a C function cannot return a set; the expression
evaluator has a hard-wired notion that only SQL-language functions can
return sets.

This is fixed in current development sources (7.1-to-be). How to do it
might even be documented by the time 7.1 gets out, though it's not yet
:-(

regards, tom lane

#4mlw
markw@mohawksoft.com
In reply to: Edmar Wiggers (#2)
Re: C function returning rows, was Boolean text, with phrase ranking, search under Postgres

Edmar Wiggers wrote:

I too am interested in full text indexing under PostgreSQL.

As I have described it, does it sound like something you would use? It
is designed more like a search engine. It will do a full text / boolean
search with phrase ranking in about 10~40 ms depending on the number of
terms, memory and CPU speed.

I need to return or, as it is implemented, fill a table with the index
keys as entries in a table to preserve rank.

BTW: it also does a modified metaphone parsing on text.

Moreover, I have run into the same problem as you (in a different project
though): how do you return more than 1 value from a C function? I'd like to
return "rows", but, if at all possible, NOT to create a table. How about a
cursor? Perhaps one could then do something such as

insert into new_table select textsearch('bla...');

I would like this API best:

create temp table search_results as select textsearch('bla bla');

If I can do this, it will be very cool.

Does anyone know if it's possible?

-----Original Message-----
From: markw <markw@mohawksoft.com>
Sent: Monday, October 16, 2000 3:33 PM
To: pgsql-announce@postgresql.org,pgsql-general@postgresql.org
Subject: [ANNOUNCE] Boolean text, with phrase ranking, search under
Postgres

I am working on a GPL version of a boolean text search engine for
PostgreSQL.

How it works:

You run a program which executes a query and builds a set of external
indexes.

Then you run a daemon process which does the processing of the text
query.

In postgres, you create a temporary table of results, call textsearch
which populates the table,
lastly, you join with the results table. The code looks like this:

--
-- Create a temporary table for search results
--
create temp table search_result (key integer, rank integer);
--
-- Call search daemon to populate table
--
select textsearch('performer2{ waitresses } song { i know what boys like
}');

... (cut) ...

What I would like to do is create the result table in the function
and/or accept a table name as a parameter. I can't seem to do this,
perhaps I am missing something trivial.

Is there a way to create a table from within 'c' and return it? I am
not a postgres pro, I am a C/C++ guy and do not know the ins and outs of
Postgres, and it should be a lot easier to make something more eficient.
Any insign would be appreciated.

--
http://www.mohawksoft.com

#5Mitch Vincent
mitch@venux.net
In reply to: Edmar Wiggers (#2)
Re: Re: C function returning rows, was Boolean text, with phrase ranking, search under Postgres

Edmar Wiggers wrote:

I too am interested in full text indexing under PostgreSQL.

As I have described it, does it sound like something you would use? It
is designed more like a search engine. It will do a full text / boolean
search with phrase ranking in about 10~40 ms depending on the number of
terms, memory and CPU speed.

Does this actually exist or is it still in the design phase of development?

-Mitch