limit over attribute size if index over it exists

Started by pajaialmost 20 years ago3 messagesgeneral
Jump to latest
#1pajai
patrick.jayet@gmail.com

Hi everybody,

I have got an issue with PostgreSQL. There is a limitation on the
column length of a tuple, in case there is an index over it. In the
actual project I am working on, I meet such a situation. I have got an
attribute over which I am doing a search (that is, I need an index over
it), but this attribute can be in some cases very large (100KB+).

The log message I get from Postgres, if I try to insert a tuple with
such a big attribute (e.g. 10K) is the following:

ERROR: index row requires 15704 bytes, maximum size is 8191

(PostgreSQL 8.07 under Linux. The index is a btree index.)

I have thought of a possible workaround. I would like to know if it
seems reasonable. The idea would be to build a hash, on the client
side, over the problematic column (let's say column a). I then store in
the db the attribute a (without index) and the hash(a) (with an index).
Then when I am doing a select, I use firstly a sub-select to choose all
tuples with the right hash (quick, with index), and then an outer
select to choose the tuple with the right attribute a (slow, sequential
scan, but normally few tuples, because few collisions). Something like
that:

SELECT b
FROM (
SELECT a, b
FROM foo
WHERE hash='<hash(a)>'
) as bar
WHERE bar.a='<a>'

(Actually, in my case the situation is slightly more complicated
because I don't have just one attribute but 2+, so there are some index
types that I cannot use. Anyway the principle is the same).

Does this solution seem reasonable, or is there other (more elegant)
ways to do that?

Thank you in advance.
Cheers,

Pat

#2Martijn van Oosterhout
kleptog@svana.org
In reply to: pajai (#1)
Re: limit over attribute size if index over it exists

On Mon, Jun 26, 2006 at 02:52:56AM -0700, pajai wrote:

I have thought of a possible workaround. I would like to know if it
seems reasonable. The idea would be to build a hash, on the client
side, over the problematic column (let's say column a). I then store in
the db the attribute a (without index) and the hash(a) (with an index).
Then when I am doing a select, I use firstly a sub-select to choose all
tuples with the right hash (quick, with index), and then an outer
select to choose the tuple with the right attribute a (slow, sequential
scan, but normally few tuples, because few collisions). Something like
that:

Perhaps you should look into functional indexes. Indexes over a
function.

CREATE INDEX foo_index ON foo( hash(a) );

This index will automatically be used if you make a query like this:

... WHERE hash(a) = 'blah';

Hope this helps,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

Show quoted text

From each according to his ability. To each according to his ability to litigate.

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: pajai (#1)
Re: limit over attribute size if index over it exists

"pajai" <patrick.jayet@gmail.com> writes:

I have got an issue with PostgreSQL. There is a limitation on the
column length of a tuple, in case there is an index over it. In the
actual project I am working on, I meet such a situation. I have got an
attribute over which I am doing a search (that is, I need an index over
it), but this attribute can be in some cases very large (100KB+).

It sounds to me a lot like you may be in need of full-text-index code ---
see contrib/tsearch2. If you were not trying to roll-your-own text
searcher, please give more details. I can hardly imagine a situation in
which it is useful to make a btree index on 100KB values.

regards, tom lane