Thread: index row size exceeds btree maximum, 2713 - Solutions?
I'm trying to load some data into PostgreSQL 8.0.3, and I got the error message "index row size 2904 exceeds btree maximum, 2713". After a bunch of searching, I believe that I am getting this error because a value that I am indexing is longer than ~ 1/3 of the block size - or the BLCKSZ variable in the src/include/pg_config_manual.h file. Am I correct so far? I need to fix this problem. I cannot change the indexed columns. I cannot shorten the data value. And I cannot MD5 it, or any of those hashing types of solutions that I saw a lot while searching. Is there a variable I can set somewhere, so that postgresql would just truncate the value to the max length that the index can handle when it goes to enter it into the index, instead of failing with an error? I would be fine with not having this particular row fully indexed, so long as I could still retrieve the full data value. The other solution that I saw was to modify the BLCKSZ variable. From what I saw, it appears that to change that variable, I would need to dump my databases out, recompile everything, and then reload them from scratch. Is this correct? Currently the BLCKSZ variable is set to 8192. What are the performance/disk usage/other? implications of doubling this value, to 16384? Any other suggestions in dealing with this problem? Thanks, Dan
On Mon, Jul 18, 2005 at 14:44:26 -0500, Dan Armbrust <daniel.armbrust.list@gmail.com> wrote: > I'm trying to load some data into PostgreSQL 8.0.3, and I got the error > message "index row size 2904 exceeds btree maximum, 2713". After a > bunch of searching, I believe that I am getting this error because a > value that I am indexing is longer than ~ 1/3 of the block size - or the > BLCKSZ variable in the src/include/pg_config_manual.h file. > > Am I correct so far? > > I need to fix this problem. I cannot change the indexed columns. I > cannot shorten the data value. And I cannot MD5 it, or any of those > hashing types of solutions that I saw a lot while searching. Can you explain how you are using the index now and what you are trying to accomplish? It is hard to suggest alternatives without knowing what you are really trying to do.
We have built a Model for terminologies that we call The Lexical Grid (more info http://informatics.mayo.edu/LexGrid/index.php) LexGrid has multiple backend data storage mechanisms, including LDAP and SQL. We do our best to remain implementation independent - our SQL implementations, for example can run against MS Access, DB2, MySQL and PostgreSQL. I'm currently trying to load a new terminology into a PosgreSQL backend, and arrived at this error because it happens to have a couple of very large data values that get mapped into the 'propertyvalue' field. The structure of the table that I am (currently) having problems with is: (apologies if your client does not parse HTML - this is what is convenient to me right now) codingschemename character varying(70) NOT NULL <a href="http://mir04.mayo.edu/pgadmin/tblproperties.php?action=properties&database=LexGrid2&schema=public&table=conceptproperty&column=codingschemename&">Alter <a href="http://mir04.mayo.edu/pgadmin/tblproperties.php?action=confirm_drop&database=LexGrid2&schema=public&table=conceptproperty&column=codingschemename&">Drop conceptcode character varying(100) NOT NULL <a href="http://mir04.mayo.edu/pgadmin/tblproperties.php?action=properties&database=LexGrid2&schema=public&table=conceptproperty&column=conceptcode&">Alter <a href="http://mir04.mayo.edu/pgadmin/tblproperties.php?action=confirm_drop&database=LexGrid2&schema=public&table=conceptproperty&column=conceptcode&">Drop propertyid character varying(50) NOT NULL <a href="http://mir04.mayo.edu/pgadmin/tblproperties.php?action=properties&database=LexGrid2&schema=public&table=conceptproperty&column=propertyid&">Alter <a href="http://mir04.mayo.edu/pgadmin/tblproperties.php?action=confirm_drop&database=LexGrid2&schema=public&table=conceptproperty&column=propertyid&">Drop property character varying(250) NOT NULL <a href="http://mir04.mayo.edu/pgadmin/tblproperties.php?action=properties&database=LexGrid2&schema=public&table=conceptproperty&column=property&">Alter <a href="http://mir04.mayo.edu/pgadmin/tblproperties.php?action=confirm_drop&database=LexGrid2&schema=public&table=conceptproperty&column=property&">Drop language character varying(32) <a href="http://mir04.mayo.edu/pgadmin/tblproperties.php?action=properties&database=LexGrid2&schema=public&table=conceptproperty&column=language&">Alter <a href="http://mir04.mayo.edu/pgadmin/tblproperties.php?action=confirm_drop&database=LexGrid2&schema=public&table=conceptproperty&column=language&">Drop presentationformat character varying(50) <a href="http://mir04.mayo.edu/pgadmin/tblproperties.php?action=properties&database=LexGrid2&schema=public&table=conceptproperty&column=presentationformat&">Alter <a href="http://mir04.mayo.edu/pgadmin/tblproperties.php?action=confirm_drop&database=LexGrid2&schema=public&table=conceptproperty&column=presentationformat&">Drop datatype character varying(50) <a href="http://mir04.mayo.edu/pgadmin/tblproperties.php?action=properties&database=LexGrid2&schema=public&table=conceptproperty&column=datatype&">Alter <a href="http://mir04.mayo.edu/pgadmin/tblproperties.php?action=confirm_drop&database=LexGrid2&schema=public&table=conceptproperty&column=datatype&">Drop ispreferred boolean <a href="http://mir04.mayo.edu/pgadmin/tblproperties.php?action=properties&database=LexGrid2&schema=public&table=conceptproperty&column=ispreferred&">Alter <a href="http://mir04.mayo.edu/pgadmin/tblproperties.php?action=confirm_drop&database=LexGrid2&schema=public&table=conceptproperty&column=ispreferred&">Drop degreeoffidelity character varying(50) <a href="http://mir04.mayo.edu/pgadmin/tblproperties.php?action=properties&database=LexGrid2&schema=public&table=conceptproperty&column=degreeoffidelity&">Alter <a href="http://mir04.mayo.edu/pgadmin/tblproperties.php?action=confirm_drop&database=LexGrid2&schema=public&table=conceptproperty&column=degreeoffidelity&">Drop matchifnocontext boolean <a href="http://mir04.mayo.edu/pgadmin/tblproperties.php?action=properties&database=LexGrid2&schema=public&table=conceptproperty&column=matchifnocontext&">Alter <a href="http://mir04.mayo.edu/pgadmin/tblproperties.php?action=confirm_drop&database=LexGrid2&schema=public&table=conceptproperty&column=matchifnocontext&">Drop representationalform character varying(50) <a href="http://mir04.mayo.edu/pgadmin/tblproperties.php?action=properties&database=LexGrid2&schema=public&table=conceptproperty&column=representationalform&">Alter <a href="http://mir04.mayo.edu/pgadmin/tblproperties.php?action=confirm_drop&database=LexGrid2&schema=public&table=conceptproperty&column=representationalform&">Drop propertyvalue text NOT NULL <a href="http://mir04.mayo.edu/pgadmin/tblproperties.php?action=properties&database=LexGrid2&schema=public&table=conceptproperty&column=propertyvalue&">Alter <a href="http://mir04.mayo.edu/pgadmin/tblproperties.php?action=confirm_drop&database=LexGrid2&schema=public&table=conceptproperty&column=propertyvalue&">Drop The structure of the table is not easily changed, as it closely follows our model. The index that is failing is CREATE INDEX i1 ON conceptproperty USING btree (codingschemename, property, propertyvalue). Usually, the 'propertyValue' field is fairly short - 100 chars or less. And in those cases, I need to be able to do an indexed search on it. In this particular case, this terminology has a propertyValue that is very long. I can't just toss it. I don't really care if it makes it into the index - for a terminology with as much data as this has, I'll be using Lucene to do text searches anyway - but I do need to be able to return the full propertyValue in response to a propertyId query. I also don't want to negatively affect the performance of the rest of the DB just to accommodate this instance - hence the questions about the implications of changing the BLCKSZ variable. Dan Bruno Wolff III wrote: On Mon, Jul 18, 2005 at 14:44:26 -0500, Dan Armbrust <daniel.armbrust.list@gmail.com> wrote: I'm trying to load some data into PostgreSQL 8.0.3, and I got the error message "index row size 2904 exceeds btree maximum, 2713". After a bunch of searching, I believe that I am getting this error because a value that I am indexing is longer than ~ 1/3 of the block size - or the BLCKSZ variable in the src/include/pg_config_manual.h file. Am I correct so far? I need to fix this problem. I cannot change the indexed columns. I cannot shorten the data value. And I cannot MD5 it, or any of those hashing types of solutions that I saw a lot while searching. Can you explain how you are using the index now and what you are trying to accomplish? It is hard to suggest alternatives without knowing what you are really trying to do.
> I'm trying to load some data into PostgreSQL 8.0.3, and I got the error > message "index row size 2904 exceeds btree maximum, 2713". After a > bunch of searching, I believe that I am getting this error because a > value that I am indexing is longer than ~ 1/3 of the block size - or the > BLCKSZ variable in the src/include/pg_config_manual.h file. > > Am I correct so far? > > I need to fix this problem. I cannot change the indexed columns. > can you drop the index and retry? what typo is the column? -- Atentamente, Jaime Casanova (DBA: DataBase Aniquilator ;)
> The index that is failing is CREATE INDEX i1 ON conceptproperty USING btree > (codingschemename, property, propertyvalue). > I don't think you could create indexes on text fields... there are other type of indexes for that... tsearch2 for example -- Atentamente, Jaime Casanova (DBA: DataBase Aniquilator ;)
Dan Armbrust <daniel.armbrust.list@gmail.com> writes: > The index that is failing is CREATE INDEX i1 ON conceptproperty USING > btree (codingschemename, property, propertyvalue).<br> > <br> > Usually, the 'propertyValue' field is fairly short - 100 chars or > less. And in those cases, I need to be able to do an indexed search on > it. You do realize that the above index has nothing whatever to do with searches on propertyvalue? (Or at least not propertyvalue alone.) My advice is to drop propertyvalue from the index and see what performance you get. You could possibly do something with a partial index, eg create index ... where length(propertyvalue) < 2000 but that'd probably require modifying your queries, so I suspect it's not really very practical. regards, tom lane
On Mon, 2005-07-18 at 15:17, Dan Armbrust wrote: > We have built a Model for terminologies that we call The Lexical Grid > (more info http://informatics.mayo.edu/LexGrid/index.php) > > LexGrid has multiple backend data storage mechanisms, including LDAP > and SQL. We do our best to remain implementation independent - our > SQL implementations, for example can run against MS Access, DB2, MySQL > and PostgreSQL. > > I'm currently trying to load a new terminology into a PosgreSQL > backend, and arrived at this error because it happens to have a couple > of very large data values that get mapped into the 'propertyvalue' > field. Well, if you're trying to maintain compatibility to multiple backends, then requiring a non-standard block size is going to be a big non-starter for most folks running postgresql. Very few, if any users, are going to be willing to install a version of postgresql setup that way just because of one program's needs. Does this need to be a unique index for any reason? If so, then an md5(propertyvalue) would give you that. If not, then you could make a partial index like so: create index yada_dx on yada (propertyvalue) where length(propertybalue)<100; But that would require you to add "where length(propertyvalue)<100" onto your queries to use the index, I believe. OTOH, this may be the time to use a hash index.
Hmm, well, I don't know if it is actually building an index properly on this column, I just assumed that it was. It doesn't fail on every insert, only on the one that has a really long text value. I know it doesn't use the index when I do "ILIKE" queries, resulting in poor performance... but I assumed that was because I was trying to do a case insensitve search on a case sensitive column index. I didn't want to go down the road of writing even more database implementation specific code. I will usually be using Lucene for the full text searches anyway. Where is the documentation on tsearch2? I haven't seen it mentioned anywhere except a couple of mailing list postings. All of my other limitations on changing things aside - given a query like this: Select * from conceptproperty where codingSchemeName='foo' AND property='anotherfoo' and propertyValue ILIKE 'valu%' What indexe(s) would be recommended? My current 3 column index (that works on other DB's) doesn't perform well due to case sensitivity issues, and now fails, due to data length issues. Dan Dan Jaime Casanova wrote: <blockquote cite="midc2d9e70e05071813323a499a41@mail.gmail.com" type="cite"> The index that is failing is CREATE INDEX i1 ON conceptproperty USING btree (codingschemename, property, propertyvalue). I don't think you could create indexes on text fields... there are other type of indexes for that... tsearch2 for example
Nevermind this question... > Where is the documentation on tsearch2? Google first, ask second, I remind myself again... I knew I hadn't seen it mentioned in the official postgresql manual.. didn't think about it being an extension. Dan Dan Armbrust wrote: Hmm, well, I don't know if it is actually building an index properly on this column, I just assumed that it was. It doesn't fail on every insert, only on the one that has a really long text value. I know it doesn't use the index when I do "ILIKE" queries, resulting in poor performance... but I assumed that was because I was trying to do a case insensitve search on a case sensitive column index. I didn't want to go down the road of writing even more database implementation specific code. I will usually be using Lucene for the full text searches anyway. I haven't seen it mentioned anywhere except a couple of mailing list postings. All of my other limitations on changing things aside - given a query like this: Select * from conceptproperty where codingSchemeName='foo' AND property='anotherfoo' and propertyValue ILIKE 'valu%' What indexe(s) would be recommended? My current 3 column index (that works on other DB's) doesn't perform well due to case sensitivity issues, and now fails, due to data length issues. Dan Dan Jaime Casanova wrote: <blockquote cite="midc2d9e70e05071813323a499a41@mail.gmail.com" type="cite"> The index that is failing is CREATE INDEX i1 ON conceptproperty USING btree (codingschemename, property, propertyvalue). I don't think you could create indexes on text fields... there are other type of indexes for that... tsearch2 for example
On Mon, 2005-07-18 at 16:01, Dan Armbrust wrote: > Hmm, well, I don't know if it is actually building an index properly > on this column, I just assumed that it was. It doesn't fail on every > insert, only on the one that has a really long text value. I know it > doesn't use the index when I do "ILIKE" queries, resulting in poor > performance... but I assumed that was because I was trying to do a > case insensitve search on a case sensitive column index. I didn't > want to go down the road of writing even more database implementation > specific code. I will usually be using Lucene for the full text > searches anyway. > > Where is the documentation on tsearch2? I haven't seen it mentioned > anywhere except a couple of mailing list postings. > > All of my other limitations on changing things aside - given a query > like this: > > Select * from conceptproperty where codingSchemeName='foo' AND > property='anotherfoo' and propertyValue ILIKE 'valu%' OK, how selective are each of these three fields? If codingSchemeName isn't very selective, then you'll likely never win with an index. Put the most selective item first, that way the database will have the smallest set of data to have to play with to get the answer you asked for. Next, hash indexes might have been the answer, when I didn't know you were using like. Hash indexes, at least on my 7.4 box, can't use like. Ilike can't use indexes period, and it's better to create a functional index: create index indexname on table (lower(fieldname)); and then always search on lower(fieldname); > What indexe(s) would be recommended? > > My current 3 column index (that works on other DB's) doesn't perform > well due to case sensitivity issues, and now fails, due to data length > issues. The reason your three column index likely doesn't work well in postgresql is that you probably don't have much selectivity in the first column. I'm just guessing there aren't that many coding schemes, but there probably are plenty of properties. So, setting up your index with that value first, and in first in your where clause should help, assuming it's more selective. IF you build a testcase on propertvalue, and try ilike, you'll notice that it, quite simply does NOT use indexes.
Dan Armbrust <daniel.armbrust.list@gmail.com> writes: > All of my other limitations on changing things aside - given a query > like this: > > Select * from conceptproperty where codingSchemeName='foo' AND > property='anotherfoo' and propertyValue ILIKE 'valu%' > > What indexe(s) would be recommended? I'd index on codingSchemeName and property and not worry so much about propertyValue. I rather doubt that that part of the query is adding a performance-critical amount of selectivity --- and even if you could fit propertyValue into the index, you'd have to jump through hoops to get a case-insensitive search on it when the other columns are case-sensitive. regards, tom lane
Dan Armbrust wrote: > Hmm, well, I don't know if it is actually building an index properly on > this column, I just assumed that it was. It doesn't fail on every > insert, only on the one that has a really long text value. I know it > doesn't use the index when I do "ILIKE" queries, resulting in poor > performance... but I assumed that was because I was trying to do a case > insensitve search on a case sensitive column index. I didn't want to go > down the road of writing even more database implementation specific > code. I will usually be using Lucene for the full text searches anyway. > > Where is the documentation on tsearch2? I haven't seen it mentioned > anywhere except a couple of mailing list postings. > http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/ And a devx article here: http://www.devx.com/opensource/Article/21674/0 > All of my other limitations on changing things aside - given a query > like this: > > Select * from conceptproperty where codingSchemeName='foo' AND > property='anotherfoo' and propertyValue ILIKE 'valu%' > > What indexe(s) would be recommended? Why don't you do this: Select * from conceptproperty where codingSchemeName='foo' AND property='anotherfoo' and propertyValue ~ lower('valu'); and have an index: create index lower_propertyvalue_idx on conceptproperty(lower(propertyvalue)); and have a index: (codingschemename,property) I would also be curious to see an explain analyze. Sincerely, Joshua D. Drake -- Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240 PostgreSQL Replication, Consulting, Custom Programming, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/
On Mon, 18 Jul 2005, Jaime Casanova wrote: > I don't think you could create indexes on text fields... there are > other type of indexes for that... tsearch2 for example You can index text fields, but you can't insert values bigger then BLOCKSIZE/3 when you have an index on that column. -- /Dennis Björklund
On Tue, 2005-07-19 at 05:42, Dennis Bjorklund wrote: > On Mon, 18 Jul 2005, Jaime Casanova wrote: > > > I don't think you could create indexes on text fields... there are > > other type of indexes for that... tsearch2 for example > > You can index text fields, but you can't insert values bigger then > BLOCKSIZE/3 when you have an index on that column.] Please note that the size limitation is for btree indexes, the most common and well tested index types. For hash you can have a much larger value, but only direct matching is supported. I don't know about GiST...
Scott Marlowe <smarlowe@g2switchworks.com> writes: > On Tue, 2005-07-19 at 05:42, Dennis Bjorklund wrote: >> You can index text fields, but you can't insert values bigger then >> BLOCKSIZE/3 when you have an index on that column.] > Please note that the size limitation is for btree indexes, the most > common and well tested index types. > For hash you can have a much larger value, but only direct matching is > supported. > I don't know about GiST... None of the index types support entries larger than BLOCKSIZE-less-a-bit, so switching to a different index type won't do more than push the problem out by a factor of about 3. regards, tom lane
On Tue, 2005-07-19 at 10:25, Tom Lane wrote: > Scott Marlowe <smarlowe@g2switchworks.com> writes: > > On Tue, 2005-07-19 at 05:42, Dennis Bjorklund wrote: > >> You can index text fields, but you can't insert values bigger then > >> BLOCKSIZE/3 when you have an index on that column.] > > > Please note that the size limitation is for btree indexes, the most > > common and well tested index types. > > For hash you can have a much larger value, but only direct matching is > > supported. > > I don't know about GiST... > > None of the index types support entries larger than BLOCKSIZE-less-a-bit, > so switching to a different index type won't do more than push the > problem out by a factor of about 3. Are they compressed? It would look to me like maybe they are, or something strange like that. When I fed highly compressable data into an indexed field, it took a LOT of said text to get a failure method.
Scott Marlowe <smarlowe@g2switchworks.com> writes: > On Tue, 2005-07-19 at 10:25, Tom Lane wrote: >> None of the index types support entries larger than BLOCKSIZE-less-a-bit, >> so switching to a different index type won't do more than push the >> problem out by a factor of about 3. > Are they compressed? It would look to me like maybe they are, or > something strange like that. When I fed highly compressable data into > an indexed field, it took a LOT of said text to get a failure method. Yes, we do try to compress large index entries --- so the BLOCKSIZE or BLOCKSIZE/3 limitation applies after compression. That's independent of index type AFAIK. What we don't have is a TOAST table backing every index to allow out-of-line storage ... regards, tom lane
Dan Armbrust wrote: > I'm trying to load some data into PostgreSQL 8.0.3, and I got the > error message "index row size 2904 exceeds btree maximum, 2713". > After a bunch of searching, I believe that I am getting this error > because a value that I am indexing is longer than ~ 1/3 of the block > size - or the BLCKSZ variable in the src/include/pg_config_manual.h file. > > Am I correct so far? > > I need to fix this problem. I cannot change the indexed columns. I > cannot shorten the data value. And I cannot MD5 it, or any of those > hashing types of solutions that I saw a lot while searching. > > Is there a variable I can set somewhere, so that postgresql would just > truncate the value to the max length that the index can handle when it > goes to enter it into the index, instead of failing with an error? I > would be fine with not having this particular row fully indexed, so > long as I could still retrieve the full data value. > > The other solution that I saw was to modify the BLCKSZ variable. From > what I saw, it appears that to change that variable, I would need to > dump my databases out, recompile everything, and then reload them > from scratch. Is this correct? > > Currently the BLCKSZ variable is set to 8192. What are the > performance/disk usage/other? implications of doubling this value, to > 16384? > > Any other suggestions in dealing with this problem? > > Thanks, > > Dan > Thanks for all the information and ideas WRT this issue. I ended up just having to remove the index from this particular column that was having the issue - in my particular case, I didn't lose anything by doing this anyway, because the index wasn't being used for its intended purpose anyway, due to case sensitivity issues. Could I suggest adding this error, its causes, and possible solutions from this thread (http://archives.postgresql.org/pgsql-general/2005-07/msg00731.php) to the FAQ? It took me a long time to connect all the dots through a lot of different e-mail threads. Also, maybe the max index size should be documented in the manual as well? Dan -- **************************** Daniel Armbrust Biomedical Informatics Mayo Clinic Rochester daniel.armbrust(at)mayo.edu http://informatics.mayo.edu/