Indexing on arrays

Started by Ivan E. Panchenkoabout 25 years ago2 messages
#1Ivan E. Panchenko
ivan@xray.sai.msu.ru

Dear Hackers,

While working on a postgres-based fulltext searching system
we encountered the following problem:

There is a table
create table t (
x int []
)
and a given integer constant y.
The task is to find those records of this table, which contain the
value y in the arrays x.

This could be implemented by writing a function
array_contains(array,value)
and selecting :
select * from table where array_contains(table.x, y);

Such SQL statement would result in a long sequential scan, which is not
cool and not fast. It could be much more useful if we could use an index
for such a query.

If there were a kind of B-tree index which allows to have several
key values for a record, the problem could be solved easily!

We would like to know if such a feature is already implemented
in postgres indexes, otherwise are there any serious difficulties in
implementing it.

May be, GiSt could be useful for this task. Does anybody know any alive
implementation of GiST ?

Regards, Ivan Panchenko

#2mlw
markw@mohawksoft.com
In reply to: Ivan E. Panchenko (#1)
Re: Indexing on arrays

I am also working on a full text search system, and I have a similar
problem, although I can get around the full table scan if I can simply
return a full set of tuples.

select * from table where table.key in ( function('bla bla bla') );

Or

create table result as function('bla bla bla');

select * from table where table.key = result.key;

I have been trying to figure out how to return a variable number and
format of tuples, but am getting lost in the code. Any help anyone has
would be greatly appreciated.

"Ivan E. Panchenko" wrote:

Dear Hackers,

While working on a postgres-based fulltext searching system
we encountered the following problem:

There is a table
create table t (
x int []
)
and a given integer constant y.
The task is to find those records of this table, which contain the
value y in the arrays x.

This could be implemented by writing a function
array_contains(array,value)
and selecting :
select * from table where array_contains(table.x, y);

Such SQL statement would result in a long sequential scan, which is not
cool and not fast. It could be much more useful if we could use an index
for such a query.

If there were a kind of B-tree index which allows to have several
key values for a record, the problem could be solved easily!

We would like to know if such a feature is already implemented
in postgres indexes, otherwise are there any serious difficulties in
implementing it.

May be, GiSt could be useful for this task. Does anybody know any alive
implementation of GiST ?

Regards, Ivan Panchenko

--
http://www.mohawksoft.com