Thread: ERROR: row is too big: size 9336, exceeds size 8160 -- when populating a record with tsquery and tsvector fields

Hi,
I am experiencing the "row is too big" error (with PostgreSQL 9.0.1)
when populating a table that has tsquery and tsvector fields.
Are fields of the tsquery and tsvector datatypes subject to this
row-size restriction?


Allan.

Allan Kamau <kamauallan@gmail.com> writes:
> I am experiencing the "row is too big" error (with postgreSQL-9.0.1)
> when populating a table having a tsquery and tsvector fields.

Could we see the exact declaration of your table, please?  Did you
play games with the STORAGE attribute of any of your columns?

            regards, tom lane

On Thu, 2010-11-18 at 07:37 +0300, Allan Kamau wrote:
> Hi,
> I am experiencing the "row is too big" error (with postgreSQL-9.0.1)
> when populating a table having a tsquery and tsvector fields.
> Are fields of tsquery and tsvector datatypes affected by this row size
> restriction?

Uhh... what index type did you use on the column?

>
>
> Allan.
>

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt


On Thu, Nov 18, 2010 at 8:35 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Allan Kamau <kamauallan@gmail.com> writes:
>> I am experiencing the "row is too big" error (with postgreSQL-9.0.1)
>> when populating a table having a tsquery and tsvector fields.
>
> Could we see the exact declaration of your table, please?  Did you
> play games with the STORAGE attribute of any of your columns?
>
>                        regards, tom lane
>


As requested, below is the table definition.

It seems the tsvector fields did not contribute to this error: the
error occurred when I attempted to populate the tsquery field with
rather long tsquery data, and leaving the tsvector fields unpopulated
produced the same error with the same reported size, which indicates
that the data in the tsvector fields is not the cause.


CREATE TABLE farm.produce
(id INTEGER NOT NULL DEFAULT NEXTVAL('farm.produce_seq')
,process___id TEXT NOT NULL
,item_names tsvector NULL
,product__ids__tsquery tsquery NULL
,product__ids__tsvector tsvector NULL
,population_time TIMESTAMP NOT NULL DEFAULT clock_timestamp()
,PRIMARY KEY(id)
)
;

There are currently no indexes on the fields of type tsquery or
tsvector in my table.

Allan.
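For anyone wanting to reproduce this, a minimal sketch (untested; the
term count needed to cross the limit will vary) is to build one long
OR'ed tsquery and insert it into the table above. Because tsquery
values are stored inline on the heap page, a sufficiently long query
should trigger the same "row is too big" error:

```sql
-- Hypothetical reproduction: OR together many terms into one tsquery.
-- string_agg() and generate_series() are available in 9.0.
INSERT INTO farm.produce (process___id, product__ids__tsquery)
SELECT 'repro',
       (SELECT string_agg('id' || i::text, ' | ')
          FROM generate_series(1, 2000) AS i)::tsquery;
-- Expected (with enough terms): ERROR: row is too big: size ..., exceeds size 8160
```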

On Thursday 18 November 2010 05:37:51 Allan Kamau wrote:
> Hi,
> I am experiencing the "row is too big" error (with postgreSQL-9.0.1)
> when populating a table having a tsquery and tsvector fields.
> Are fields of tsquery and tsvector datatypes affected by this row size
> restriction?
Looks like you tried to index the text field itself with a btree?

Andres

Allan Kamau <kamauallan@gmail.com> writes:
> CREATE TABLE farm.produce
> (id INTEGER NOT NULL DEFAULT NEXTVAL('farm.produce_seq')
> ,process___id TEXT NOT NULL
> ,item_names tsvector NULL
> ,product__ids__tsquery tsquery NULL
> ,product__ids__tsvector tsvector NULL
> ,population_time TIMESTAMP NOT NULL DEFAULT clock_timestamp()
> ,PRIMARY KEY(id)
> )

> It seems the tsvector field did not contribute to this error as the
> error occurred when I attempted to populate the tsquery field with
> rather long tsquery data. Without populating the tsvector field but
> got the same error with the same size message indicating that the data
> in the tsvector fields do not lead to this problem.

Yeah, on poking into the system catalogs I see that tsquery is declared
as not supporting toasting (it has typstorage = 'p'lain).  I don't
offhand know the implementation reason for that or whether it would be
a good idea to change it.  But obviously Teodor's expectation was that
nobody would ever want to store large tsqueries on disk.  I guess it
would be worth asking what's your use-case for storing tsquery, as
opposed to tsvector?

            regards, tom lane
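The storage setting Tom refers to can be checked directly in the
pg_type catalog; on a stock install, tsvector uses extended storage
(TOAST-able) while tsquery is plain (stored inline, hence subject to
the page-size limit):

```sql
-- typstorage: 'p' = plain (no TOAST), 'x' = extended (compressed
-- and moved out-of-line as needed).
SELECT typname, typstorage
FROM pg_type
WHERE typname IN ('tsquery', 'tsvector');
--  typname  | typstorage
-- ----------+------------
--  tsquery  | p
--  tsvector | x
```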

On Thu, Nov 18, 2010 at 6:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Allan Kamau <kamauallan@gmail.com> writes:
>> CREATE TABLE farm.produce
>> (id INTEGER NOT NULL DEFAULT NEXTVAL('farm.produce_seq')
>> ,process___id TEXT NOT NULL
>> ,item_names tsvector NULL
>> ,product__ids__tsquery tsquery NULL
>> ,product__ids__tsvector tsvector NULL
>> ,population_time TIMESTAMP NOT NULL DEFAULT clock_timestamp()
>> ,PRIMARY KEY(id)
>> )
>
>> It seems the tsvector field did not contribute to this error as the
>> error occurred when I attempted to populate the tsquery field with
>> rather long tsquery data. Without populating the tsvector field but
>> got the same error with the same size message indicating that the data
>> in the tsvector fields do not lead to this problem.
>
> Yeah, on poking into the system catalogs I see that tsquery is declared
> as not supporting toasting (it has typstorage = 'p'lain).  I don't
> offhand know the implementation reason for that or whether it would be
> a good idea to change it.  But obviously Teodor's expectation was that
> nobody would ever want to store large tsqueries on disk.  I guess it
> would be worth asking what's your use-case for storing tsquery, as
> opposed to tsvector?
>
>                        regards, tom lane
>
I have two tables in a many-to-many relationship; let's call them 'A'
and 'B'. I am supposed to obtain all possible interactions between
tuples in 'A' (a cross join), where each interaction is via one or
more common (shared) tuples of 'B'. It ends up being a recursive
interaction-detection exercise, because the actual requirement is to
find, for any group of tuples in 'A', all the tuples in 'B' that they
share.
As this is a non-trivial exercise, I was hoping to also test
performance without a joining table: instead, the ids of all tuples of
'B' related to a given tuple of 'A' would be stored in that tuple of
'A', OR'ed together as a tsquery. I would then introduce another field
of tsquery type holding the single value of the record's own id field,
and for each cross join of tuples in 'A' use the tsquery from either
side of the join to query the tsvector of table 'A'.
A similar implementation using a joining table is currently under way.
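As a sketch, the pairwise test under that scheme might look like the
following (untested, using the column names from the table definition
above; the `@@` operator matches a tsvector against a tsquery):

```sql
-- Hypothetical: find interacting pairs of 'A' tuples by checking
-- whether one row's OR'ed id-tsquery matches another row's id-tsvector.
SELECT a1.id, a2.id
FROM farm.produce AS a1
JOIN farm.produce AS a2
  ON a1.id < a2.id   -- each unordered pair once
 AND a2.product__ids__tsvector @@ a1.product__ids__tsquery;
```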

Regards,
Allan.