limit over attribute size if index over it exists

Started by pajaialmost 20 years ago3 messagesgeneral

patrick.jayet@gmail.com

almost 20 years ago

Hi everybody,

I have got an issue with PostgreSQL. There is a limitation on the
column length of a tuple, in case there is an index over it. In the
actual project I am working on, I meet such a situation. I have got an
attribute over which I am doing a search (that is, I need an index over
it), but this attribute can be in some cases very large (100KB+).

The log message I get from Postgres, if I try to insert a tuple with
such a big attribute (e.g. 10K) is the following:

ERROR: index row requires 15704 bytes, maximum size is 8191

(PostgreSQL 8.07 under Linux. The index is a btree index.)

I have thought of a possible workaround. I would like to know if it
seems reasonable. The idea would be to build a hash, on the client
side, over the problematic column (let's say column a). I then store in
the db the attribute a (without index) and the hash(a) (with an index).
Then when I am doing a select, I use firstly a sub-select to choose all
tuples with the right hash (quick, with index), and then an outer
select to choose the tuple with the right attribute a (slow, sequential
scan, but normally few tuples, because few collisions). Something like
that:

SELECT b
FROM (
SELECT a, b
FROM foo
WHERE hash='<hash(a)>'
) as bar
WHERE bar.a='<a>'

(Actually, in my case the situation is slightly more complicated
because I don't have just one attribute but 2+, so there are some index
types that I cannot use. Anyway the principle is the same).

Does this solution seem reasonable, or is there other (more elegant)
ways to do that?

Thank you in advance.
Cheers,

Pat

Martijn van Oosterhout

kleptog@svana.org

almost 20 years ago

In reply to: pajai (#1)

Re: limit over attribute size if index over it exists

On Mon, Jun 26, 2006 at 02:52:56AM -0700, pajai wrote:

I have thought of a possible workaround. I would like to know if it
seems reasonable. The idea would be to build a hash, on the client
side, over the problematic column (let's say column a). I then store in
the db the attribute a (without index) and the hash(a) (with an index).
Then when I am doing a select, I use firstly a sub-select to choose all
tuples with the right hash (quick, with index), and then an outer
select to choose the tuple with the right attribute a (slow, sequential
scan, but normally few tuples, because few collisions). Something like
that:

Perhaps you should look into functional indexes. Indexes over a
function.

CREATE INDEX foo_index ON foo( hash(a) );

This index will automatically be used if you make a query like this:

... WHERE hash(a) = 'blah';

Hope this helps,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

Show quoted text

From each according to his ability. To each according to his ability to litigate.

Tom Lane

tgl@sss.pgh.pa.us

almost 20 years ago

In reply to: pajai (#1)

Re: limit over attribute size if index over it exists

"pajai" <patrick.jayet@gmail.com> writes:

I have got an issue with PostgreSQL. There is a limitation on the
column length of a tuple, in case there is an index over it. In the
actual project I am working on, I meet such a situation. I have got an
attribute over which I am doing a search (that is, I need an index over
it), but this attribute can be in some cases very large (100KB+).

It sounds to me a lot like you may be in need of full-text-index code ---
see contrib/tsearch2. If you were not trying to roll-your-own text
searcher, please give more details. I can hardly imagine a situation in
which it is useful to make a btree index on 100KB values.

regards, tom lane