index row requires 10040 bytes, maximum size is 8191

Started by akp geek · 15 years ago · 11 messages · general
#1 akp geek
akpgeek@gmail.com

Hi all -

I am trying to create an index on a character varying field. The
column is just character varying with no limit. I am getting the following
error: "index row requires 10040 bytes, maximum size is 8191"
What can I do in postgresql.conf to handle this error? Appreciate your help

Regards

#2 Joshua D. Drake
jd@commandprompt.com
In reply to: akp geek (#1)
Re: index row requires 10040 bytes, maximum size is 8191

On Fri, 2010-11-12 at 15:47 -0500, akp geek wrote:

Hi all -

I am trying to create an index on a character varying field. The
column is just character varying with no limit. I am getting the following
error: "index row requires 10040 bytes, maximum size is 8191"
What can I do in postgresql.conf to handle this error? Appreciate your help

You can't. You could create an index on "part" of the data or use full
text search.
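
[A sketch of the "index part of the data" suggestion: an expression index on a prefix of the value, with a recheck against the full value in the query. The table and column names here are hypothetical.]

```sql
-- Index only the first 1000 characters of the value via an
-- expression index, keeping each index entry well under 8191 bytes:
CREATE INDEX items_body_prefix_idx ON items (substr(body, 1, 1000));

-- The query must use the same expression so the planner can use the
-- index; the second condition rechecks the full value from the heap:
SELECT *
FROM items
WHERE substr(body, 1, 1000) = substr('search value', 1, 1000)
  AND body = 'search value';
```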

JD

Regards

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt

#3 Steve Crawford
scrawford@pinpointresearch.com
In reply to: Joshua D. Drake (#2)
Re: index row requires 10040 bytes, maximum size is 8191

On 11/12/2010 12:52 PM, Joshua D. Drake wrote:

On Fri, 2010-11-12 at 15:47 -0500, akp geek wrote:

Hi all -

I am trying to create an index on a character varying field. The
column is just character varying with no limit. I am getting the following
error: "index row requires 10040 bytes, maximum size is 8191"
What can I do in postgresql.conf to handle this error? Appreciate your help

You can't. You could create an index on "part" of the data or use full
text search.

JD

In some cases, depending on needs, indexing on the md5 hash of such data
can be useful.
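
[A sketch of the md5 approach Steve mentions; the table and column names here are hypothetical. The hash index supports equality lookups and uniqueness checks, but not ordering or range scans.]

```sql
-- Index the fixed-size md5 digest rather than the raw text, so the
-- index entry size no longer depends on the length of the value:
CREATE UNIQUE INDEX items_body_md5_idx ON items (md5(body));

-- Equality lookup via the same expression:
SELECT * FROM items WHERE md5(body) = md5('search value');
```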

Cheers,
Steve

#4 Craig Ringer
craig@2ndquadrant.com
In reply to: Joshua D. Drake (#2)
Re: index row requires 10040 bytes, maximum size is 8191

On 13/11/2010 4:52 AM, Joshua D. Drake wrote:

On Fri, 2010-11-12 at 15:47 -0500, akp geek wrote:

Hi all -

I am trying to create an index on a character varying field. The
column is just character varying with no limit. I am getting the following
error: "index row requires 10040 bytes, maximum size is 8191"
What can I do in postgresql.conf to handle this error? Appreciate your help

You can't. You could create an index on "part" of the data or use full
text search.

Ouch, really?

I'd always assumed that btree indexes of big TOASTed values would do a
prefix match check then recheck against the heap if there's a match.
More fool me for making such an assumption.

This doesn't seem like a problem when dealing with fields that are meant
to contain big blobs of text, but it's a strong contraindication for the
advice usually given on this list to avoid varchar(n) in favour of
"text". If a "text" value over 8 KB in an indexed column will be rejected
because it cannot be indexed, that's a reason to set an explicit limit.
Additionally, not having such constraints would make it much harder to
*add* indexes to "text" columns not already indexed.

craig=> create table test ( x text );
craig=> create index test_x on test(x);
craig=> insert into test(x) values ( repeat('x', 9000) );
INSERT 0 1
craig=> insert into test(x) values ( repeat('x', 90000) );
INSERT 0 1
craig=> insert into test(x) values ( repeat('x', 900000) );
ERROR: index row requires 10324 bytes, maximum size is 8191

It seems like an index method that's smart enough to do prefix-and-hash
comparison, then a heap recheck, would be ideal. It's not really a
common enough issue to be a TODO item, though, as this is the first time
I've seen it come up on the list.

Thoughts, folks? Does this matter in practice, since anything you'd want
to index will in practice be small enough or a candidate for full-text
indexing?

--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/

#5 Joshua D. Drake
jd@commandprompt.com
In reply to: Craig Ringer (#4)
Re: index row requires 10040 bytes, maximum size is 8191

On Sat, 2010-11-13 at 09:48 +0800, Craig Ringer wrote:

Thoughts, folks? Does this matter in practice, since anything you'd want
to index will in practice be small enough or a candidate for full-text
indexing?

I have run into this problem maybe 3 times in my whole career, precisely
because if you are dealing with text that big, you move to full text
search.

JD

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt

#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joshua D. Drake (#5)
Re: index row requires 10040 bytes, maximum size is 8191

"Joshua D. Drake" <jd@commandprompt.com> writes:

On Sat, 2010-11-13 at 09:48 +0800, Craig Ringer wrote:

Thoughts, folks? Does this matter in practice, since anything you'd want
to index will in practice be small enough or a candidate for full-text
indexing?

I have run into this problem maybe 3 times in my whole career, precisely
because if you are dealing with text that big, you move to full text
search.

Yeah, the real question here is exactly what do you think a btree index
on a large text column will get you? It seems fairly unlikely that
either simple equality or simple range checks are very useful for such
data. I guess there's some use case for uniqueness checks, which we've
seen people approximate by unique-indexing the md5 hash of the column
value.

BTW, the 8K limit applies after possible in-line compression, so the
actual data value causing the failure was likely considerably longer
than 10K.
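
[Tom's point about in-line compression can be observed directly: pg_column_size() reports the stored size of a value after any compression, which for repetitive text can be far smaller than its character length. A minimal illustration, reusing Craig's test table from earlier in the thread.]

```sql
-- length() counts characters; pg_column_size() reports stored bytes
-- after any in-line (TOAST) compression. Highly repetitive values
-- like repeat('x', 90000) compress to a fraction of their length.
SELECT length(x) AS characters,
       pg_column_size(x) AS stored_bytes
FROM test
ORDER BY length(x);
```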

regards, tom lane

#7 Craig Ringer
craig@2ndquadrant.com
In reply to: Tom Lane (#6)
Re: index row requires 10040 bytes, maximum size is 8191

On 11/13/2010 11:15 AM, Tom Lane wrote:

"Joshua D. Drake"<jd@commandprompt.com> writes:

On Sat, 2010-11-13 at 09:48 +0800, Craig Ringer wrote:

Thoughts, folks? Does this matter in practice, since anything you'd want
to index will in practice be small enough or a candidate for full-text
indexing?

I have run into this problem maybe 3 times in my whole career, precisely
because if you are dealing with text that big, you move to full text
search.

Yeah, the real question here is exactly what do you think a btree index
on a large text column will get you?

About the only useful case I can see is with text data of very irregular
size. The vast majority is small, but there are a few massively bigger
items. It'd be nice if the index method had a fallback for items too big
to index in this case, such as a prefix match and heap recheck.

Of course, I've never run into this in practice, and if I did I'd be
wondering if I had my schema design quite right. I can't imagine that
the mostly aesthetic improvement of eliminating this indexing limitation
would be worth the effort. I'd never ask or want anyone to waste their
time on it, and don't intend to myself. Most of the interesting "big
text" indexing problems are solved by tsearch and/or functional indexes.

--
Craig Ringer

#8 akp geek
akpgeek@gmail.com
In reply to: Craig Ringer (#7)
Re: index row requires 10040 bytes, maximum size is 8191

Thanks for all your valuable thoughts. It took me a while to read all your
suggestions, and I am still trying to understand them fully.

What we do with the text that I have mentioned is: we have
search functionality on the text. The user enters some keywords and then the
application should be able to find all the text that matches the
keywords.


#9 Tom Lane
tgl@sss.pgh.pa.us
In reply to: akp geek (#8)
Re: index row requires 10040 bytes, maximum size is 8191

akp geek <akpgeek@gmail.com> writes:

What we do with the text that I have mentioned is, we have
search functionality on the text. The user enters some keywords and then the
application should be able to search for all the text that matches the key
word.

Well, a btree index on the entire string is approximately useless for
that anyway. Look into the text search functionality.
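
[A minimal full-text-search setup along the lines Tom suggests, for the keyword-matching use case akp geek describes. The table and column names here are hypothetical.]

```sql
-- A GIN index over the tsvector representation of the text; index
-- entries are derived lexemes, so value length is no longer an issue:
CREATE INDEX items_body_fts_idx ON items
    USING gin (to_tsvector('english', body));

-- Find rows matching the user's keywords:
SELECT *
FROM items
WHERE to_tsvector('english', body)
      @@ plainto_tsquery('english', 'some keywords');
```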

regards, tom lane

#10 Michael Shepanski
michael.shepanski@netpage.com
In reply to: Craig Ringer (#4)
Re: index row requires 10040 bytes, maximum size is 8191

Craig Ringer writes:

Thoughts, folks? Does this matter in practice, since anything you'd
want to index will in practice be small enough or a candidate for
full-text indexing?

Here's my case: the field is meant to hold a person's name, so it's
usually well under the 8191-byte limit. I want it indexed so that I can
sort on it quickly. I also want it to be robust against abuse, so if
someone does enter the text of _War and Peace_ as their name,
I want to handle that gracefully.

I can achieve that with my own programming, outside of PostgreSQL,
and that's fine if I'm the only one who has gotten into this fix. On the
other hand, if it's a common problem then there's a reason why removing
the limitation might matter in practice.

Regards,
--- Michael

#11 Scott Marlowe
scott.marlowe@gmail.com
In reply to: Michael Shepanski (#10)
Re: index row requires 10040 bytes, maximum size is 8191

On Sun, Sep 18, 2011 at 11:07 PM, Michael Shepanski
<michael.shepanski@netpage.com> wrote:

Craig Ringer writes:

Thoughts, folks? Does this matter in practice, since anything you'd
want to index will in practice be small enough or a candidate for
full-text indexing?

Here's my case: the field is meant to hold a person's name, so it's
usually well under the 8191-byte limit.  I want it indexed so that I can
sort on it quickly.  I also want it to be robust against abuse, so if
someone does enter the text of _War and Peace_ as their name,
I want to handle that gracefully.

I can achieve that by my own programming, outside of postgresql,
and that's fine if I'm the only one who has gotten into this fix.  Otoh
if it's a common problem then there's a reason why removing the
limitation might matter in practice.

What if someone enters the text of War and Peace a billion times over,
or 10^26 times, or something? There has to be a limit where you tell
the user, usually sooner rather than later, that some size is just too
big. And in the case of a name field, 1,000 characters is probably a
realistically high enough maximum that it should never be a problem,
unless the name is one of the candidates for the Very Silly Party.
Allowing users to input some great big huge chunk of text is opening
yourself up to a denial-of-service scenario: someone writes a bot
that submits 100 MB names over and over, and you watch the system just
fall over from network, I/O, or memory overload.
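
[One way to enforce the kind of cap Scott suggests, so the column stays both abuse-resistant and safely indexable. The table and column names here are hypothetical; a varchar(1000) declaration would work equally well.]

```sql
-- Bound the value at the schema level; 1000 characters is at most
-- ~4000 bytes in UTF-8, comfortably under the 8191-byte index limit:
CREATE TABLE people (
    name text NOT NULL CHECK (char_length(name) <= 1000)
);

-- A plain btree index is now safe and supports fast ORDER BY name:
CREATE INDEX people_name_idx ON people (name);
```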