creating index on changed field type

Started by David Smith, about 22 years ago, 3 messages, hackers
#1 David Smith
gegez-pgh@instytut.com.pl

Hello,
the subject is obscure, so I will try to explain. I would like to
develop an index on a text field (or on a tsvector, borrowed from tsearch2)
whose entries have a different type (for example cstring, varchar, etc.),
so that the index holds tokens of the original field. I would like to use the
PostgreSQL btree implementation, but AFAICS I cannot do it. Example:

CREATE TABLE test (id int, mytext text);
CREATE INDEX myindex on test USING myindex (mytext) ;
INSERT INTO test VALUES(1,'this is my first text');

In the index I do not want to keep the whole phrase, but the words derived
from it ('this', 'is', 'my', 'first', 'text').

My idea was to create functions mybtgettuple, mybtinsert, mybtbeginscan,
mybtrescan and so on, and in every case to ignore the original
IndexTuple, create a set of new IndexTuples (one for every term), and
invoke the original functions with them.

The problem is that index_create() in catalog/index.c sets up everything
in the system tables, in particular the type of the index field.

Should I forget about btrees and move to GIST, or is there any hack
that could solve my problem? Please help me.

Thanks in advance,
David

P.S. Maybe I should create the index on the TEXT field and store the terms
(words from the original field) also as TEXT? Will it work?
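
A minimal sketch of the entry-per-term idea described above, written in Python rather than as PostgreSQL code. Everything in it is an assumption for illustration: `TermIndex` and `tokenize` are hypothetical names, the tokenizer simply splits on whitespace, and a sorted list stands in for the btree. It only shows that one row can produce several index entries of the same type as the column.

```python
import bisect


def tokenize(text):
    """Split a phrase into lowercase terms (whitespace tokenizer, an assumption)."""
    return text.lower().split()


class TermIndex:
    """Toy ordered index: one (term, row_id) entry per distinct term of a row."""

    def __init__(self):
        self.entries = []  # kept sorted, standing in for btree leaf order

    def insert(self, row_id, text):
        # One indexed value expands into several entries, one per distinct term.
        for term in set(tokenize(text)):
            bisect.insort(self.entries, (term, row_id))

    def lookup(self, term):
        # Binary search to the first entry for this term, then scan forward.
        start = bisect.bisect_left(self.entries, (term,))
        rows = []
        for entry_term, row_id in self.entries[start:]:
            if entry_term != term:
                break
            rows.append(row_id)
        return rows


index = TermIndex()
index.insert(1, "this is my first text")
index.insert(2, "my second text")
print(index.lookup("my"))     # [1, 2]
print(index.lookup("first"))  # [1]
```

Note that in this sketch a lookup can only match single terms; a search for the whole original phrase would need to be rewritten as a combination of term lookups.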

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Smith (#1)
Re: creating index on changed field type

David Smith <gegez-pgh@instytut.com.pl> writes:

Should I forget about btrees and move to GIST,

Yes. There's no provision in the btree code for an index storage type
different from the column datatype.

regards, tom lane

#3 David Smith
gegez-pgh@instytut.com.pl
In reply to: Tom Lane (#2)
Re: creating index on changed field type

Tom Lane wrote:

David Smith <gegez-pgh@instytut.com.pl> writes:

Should I forget about btrees and move to GIST,

Yes. There's no provision in the btree code for an index storage type
different from the column datatype.

regards, tom lane

Thank you for the reply.
Let us suppose that we retain the type of the field (column), but instead of
storing the original value (key) we store tokens: instead of 'this is my
first text', we would keep 5 tokens (5 different BTItems), respectively
'this', 'is', 'my', 'first', 'text'. Will it work, or is there some other
catch I cannot see?

My performance tests suggested that GIST would be slower than the original
btree index. Maybe I am mistaken somehow...

Best regards,
David