Thread: index row requires 10040 bytes, maximum size is 8191

index row requires 10040 bytes, maximum size is 8191

From: akp geek
Hi all -

I am trying to create an index on a character varying field. The column
is character varying with no length limit. I am getting the following
error: "index row requires 10040 bytes, maximum size is 8191". What can
I do in postgresql.conf to handle this error? Appreciate your help

Regards

Re: index row requires 10040 bytes, maximum size is 8191

From: "Joshua D. Drake"
On Fri, 2010-11-12 at 15:47 -0500, akp geek wrote:
> Hi all -
>
> I am trying to create an index on a character varying field. The
> column is character varying with no length limit. I am getting the
> following error: "index row requires 10040 bytes, maximum size is
> 8191". What can I do in postgresql.conf to handle this error?
> Appreciate your help

You can't. You could create an index on "part" of the data or use full
text search.
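
A sketch of the prefix approach (the table, column, and 1000-character
cutoff are all illustrative):

-- Expression index on just a prefix of the big column:
CREATE INDEX docs_body_prefix_idx ON docs (substr(body, 1, 1000));

-- Queries must use the same expression for the index to be usable,
-- with a recheck on the full column for exact matches:
SELECT * FROM docs
WHERE substr(body, 1, 1000) = substr('needle text', 1, 1000)
  AND body = 'needle text';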

JD

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt


Re: index row requires 10040 bytes, maximum size is 8191

From: Steve Crawford
On 11/12/2010 12:52 PM, Joshua D. Drake wrote:
> On Fri, 2010-11-12 at 15:47 -0500, akp geek wrote:
>> Hi all -
>>
>> I am trying to create an index on a character varying field. The
>> column is character varying with no length limit. I am getting the
>> following error: "index row requires 10040 bytes, maximum size is
>> 8191". What can I do in postgresql.conf to handle this error?
>> Appreciate your help
>
> You can't. You could create an index on "part" of the data or use full
> text search.
>
> JD
>
In some cases, depending on needs, indexing the md5 hash of such data
can be useful.
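
A sketch of that, with illustrative names; the extra equality condition
rechecks the full value in case of an md5 collision:

CREATE INDEX docs_body_md5_idx ON docs (md5(body));

SELECT * FROM docs
WHERE md5(body) = md5('needle text')
  AND body = 'needle text';  -- recheck against hash collisions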

Cheers,
Steve


Re: index row requires 10040 bytes, maximum size is 8191

From: Craig Ringer
On 13/11/2010 4:52 AM, Joshua D. Drake wrote:
> On Fri, 2010-11-12 at 15:47 -0500, akp geek wrote:
>> Hi all -
>>
>> I am trying to create an index on a character varying field. The
>> column is character varying with no length limit. I am getting the
>> following error: "index row requires 10040 bytes, maximum size is
>> 8191". What can I do in postgresql.conf to handle this error?
>> Appreciate your help
>
> You can't. You could create an index on "part" of the data or use full
> text search.

Ouch, really?

I'd always assumed that btree indexes of big TOASTed values would do a
prefix match check then recheck against the heap if there's a match.
More fool me for making such an assumption.

This doesn't seem like a problem when dealing with fields that're meant
to contain big blobs of text, but it's a strong contraindication for the
advice usually given on this list to avoid varchar(n) in favour of
"text". If a "text" field >8kb in an indexed column will be rejected
because it cannot be indexed, that's a reason to set an explicit limit.
Additionally, not having such constraints would make it much harder to
*add* indexes to "text" columns not already indexed.

craig=> create table test ( x text );
craig=> create index test_x on test(x);
craig=> insert into test(x) values ( repeat('x', 9000) );
INSERT 0 1
craig=> insert into test(x) values ( repeat('x', 90000) );
INSERT 0 1
craig=> insert into test(x) values ( repeat('x', 900000) );
ERROR:  index row requires 10324 bytes, maximum size is 8191


It seems like an index method that's smart enough to do prefix-and-hash
comparison, then a heap recheck, would be ideal. It's not really a
common enough issue to be a TODO item, though, as this is the first time
I've seen it come up on the list.

Thoughts, folks? Does this matter in practice, since anything you'd want
to index will in practice be small enough or a candidate for full-text
indexing?

--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/

Re: index row requires 10040 bytes, maximum size is 8191

From: "Joshua D. Drake"
On Sat, 2010-11-13 at 09:48 +0800, Craig Ringer wrote:

>
> Thoughts, folks? Does this matter in practice, since anything you'd want
> to index will in practice be small enough or a candidate for full-text
> indexing?
>

I have run into this problem maybe 3 times in my whole career, precisely
because if you are dealing with text that big, you move to full text
search.

JD

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt


Re: index row requires 10040 bytes, maximum size is 8191

From: Tom Lane
"Joshua D. Drake" <jd@commandprompt.com> writes:
> On Sat, 2010-11-13 at 09:48 +0800, Craig Ringer wrote:
>> Thoughts, folks? Does this matter in practice, since anything you'd want
>> to index will in practice be small enough or a candidate for full-text
>> indexing?

> I have run into this problem maybe 3 times in my whole career, precisely
> because if you are dealing with text that big, you move to full text
> search.

Yeah, the real question here is exactly what do you think a btree index
on a large text column will get you?  It seems fairly unlikely that
either simple equality or simple range checks are very useful for such
data.  I guess there's some use case for uniqueness checks, which we've
seen people approximate by unique-indexing the md5 hash of the column
value.
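
That approximation is a one-liner (names illustrative; an md5 collision
would be reported as a duplicate, which is why it's only approximate):

CREATE UNIQUE INDEX docs_body_md5_uniq ON docs (md5(body));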

BTW, the 8K limit applies after possible in-line compression, so the
actual data value causing the failure was likely considerably longer
than 10K.

            regards, tom lane

Re: index row requires 10040 bytes, maximum size is 8191

From: Craig Ringer
On 11/13/2010 11:15 AM, Tom Lane wrote:
> "Joshua D. Drake"<jd@commandprompt.com>  writes:
>> On Sat, 2010-11-13 at 09:48 +0800, Craig Ringer wrote:
>>> Thoughts, folks? Does this matter in practice, since anything you'd want
>>> to index will in practice be small enough or a candidate for full-text
>>> indexing?
>
>> I have run into this problem maybe 3 times in my whole career, precisely
>> because if you are dealing with text that big, you move to full text
>> search.
>
> Yeah, the real question here is exactly what do you think a btree index
> on a large text column will get you?

About the only useful case I can see is with text data of very irregular
size. The vast majority is small, but there are a few massively bigger
items. It'd be nice if the index method had a fallback for items too big
to index in this case, such as a prefix match and heap recheck.

Of course, I've never run into this in practice, and if I did I'd be
wondering if I had my schema design quite right. I can't imagine that
the mostly aesthetic improvement of eliminating this indexing limitation
would be worth the effort. I'd never ask or want anyone to waste their
time on it, and don't intend to myself. Most of the interesting "big
text" indexing problems are solved by tsearch and/or functional indexes.

--
Craig Ringer

Re: index row requires 10040 bytes, maximum size is 8191

From: akp geek
Thanks for all your valuable thoughts. It took me a while to read all
your suggestions, and I'm still trying to understand them fully.

What we do with the text that I have mentioned is, we have search
functionality on the text. The user enters some keywords, and then the
application should be able to find all the text that matches those
keywords.


Re: index row requires 10040 bytes, maximum size is 8191

From: Tom Lane
akp geek <akpgeek@gmail.com> writes:
> What we do with the text that I have mentioned is, we have search
> functionality on the text. The user enters some keywords, and then the
> application should be able to find all the text that matches those
> keywords.

Well, a btree index on the entire string is approximately useless for
that anyway.  Look into the text search functionality.
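
A minimal sketch using the built-in text search machinery (table and
column names are illustrative, and 'english' is an assumed
configuration):

CREATE INDEX docs_body_fts_idx
    ON docs USING gin (to_tsvector('english', body));

SELECT * FROM docs
WHERE to_tsvector('english', body)
      @@ plainto_tsquery('english', 'user keywords');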

            regards, tom lane

Re: index row requires 10040 bytes, maximum size is 8191

From: Michael Shepanski
Craig Ringer writes:
> Thoughts, folks? Does this matter in practice, since anything you'd
> want to index will in practice be small enough or a candidate for
> full-text indexing?

Here's my case: the field is meant to hold a person's name, so it's
usually well under the 8191-byte limit.  I want it indexed so that I can
sort on it quickly.  I also want it to be robust against abuse, so if
someone does enter the text of _War and Peace_ as their name,
I want to handle that gracefully.

I can achieve that with my own programming, outside of PostgreSQL,
and that's fine if I'm the only one who has gotten into this fix. OTOH,
if it's a common problem then there's a reason why removing the
limitation might matter in practice.
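
One in-database sketch, assuming sorting on a truncated key is
acceptable (the names and the 1000-character cutoff are illustrative):

-- An absurdly long name can no longer break the index:
CREATE INDEX person_name_sort_idx ON person (substr(name, 1, 1000));

-- Ties beyond the first 1000 characters stay in arbitrary order:
SELECT * FROM person ORDER BY substr(name, 1, 1000);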

Regards,
--- Michael


Re: index row requires 10040 bytes, maximum size is 8191

From: Scott Marlowe
On Sun, Sep 18, 2011 at 11:07 PM, Michael Shepanski
<michael.shepanski@netpage.com> wrote:
> Craig Ringer writes:
>> Thoughts, folks? Does this matter in practice, since anything you'd
>> want to index will in practice be small enough or a candidate for
>> full-text indexing?
>
> Here's my case: the field is meant to hold a person's name, so it's
> usually well under the 8191-byte limit.  I want it indexed so that I can
> sort on it quickly.  I also want it to be robust against abuse, so if
> someone does enter the text of _War and Peace_ as their name,
> I want to handle that gracefully.
>
> I can achieve that with my own programming, outside of PostgreSQL,
> and that's fine if I'm the only one who has gotten into this fix. OTOH,
> if it's a common problem then there's a reason why removing the
> limitation might matter in practice.

What if someone enters the text of War and Peace times 1,000,000,000 or
so? Or times 10E26, or something? There has to be a limit where you tell
the user, usually sooner rather than later, that some size is just too
big. And in the case of a name field, 1,000 characters is probably a
realistically high enough max that it should never be a problem, unless
the name is one of the candidates for the Very Silly Party. Allowing
users to input some great big chunk of text opens you up to a
denial-of-service scenario: someone writes a bot that submits 100 MB
names over and over, and you watch the system fall over from network,
I/O, or memory overload.
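
A sketch of enforcing such a cap in the database itself (the table name
and the 1000-character limit are illustrative):

CREATE TABLE person (
    name text,
    CONSTRAINT person_name_len CHECK (char_length(name) <= 1000)
);

-- Oversized input now fails with a constraint violation instead of an
-- index error:
INSERT INTO person (name) VALUES (repeat('x', 100000));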