Thread: Oversize proc sources (was Re: [BUGS] Backend dies creating plpgsql procedures (with reproducible example!))

Wayne Piekarski <wayne@senet.com.au> writes:
> the other day I did a pg_dump of our 6.4.2 database and tried to load it
> back into 6.5 - it failed with the error message:

> FATAL 1:  btree: failed to add item to the page

IIRC this just means the tuple is too long ... btrees want to be able to
fit at least two tuples per disk page, so indexed fields can't exceed
4k bytes in a stock installation.  Sometimes you'll get away with more,
but not if two such keys end up on the same btree page.

It's not real clear to me *why* we are keeping an index on the prosrc
field of pg_proc, but we evidently are, so plpgsql source code can't
safely exceed 4k per proc as things stand.

In short, it was only by chance that you were able to put this set of
procs into 6.4 in the first place :-(

Can any hackers comment on whether pg_proc_prosrc_index is really
necessary??  Just dropping it would allow plpgsql sources to approach 8k,
and I can't think of any scenario where it's needed...

BTW, Jan has been muttering about compressing plpgsql source, which
would provide some more breathing room for big procs, but not before 6.6.

            regards, tom lane

> It's not real clear to me *why* we are keeping an index on the prosrc
> field of pg_proc, but we evidently are, so plpgsql source code can't
> safely exceed 4k per proc as things stand.
>
> In short, it was only by chance that you were able to put this set of
> procs into 6.4 in the first place :-(
>
> Can any hackers comment on whether pg_proc_prosrc_index is really
> necessary??  Just dropping it would allow plpgsql sources to approach 8k,
> and I can't think of any scenario where it's needed...
>
> BTW, Jan has been muttering about compressing plpgsql source, which
> would provide some more breathing room for big procs, but not before 6.6.

Good question.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

> Wayne Piekarski <wayne@senet.com.au> writes:
> > the other day I did a pg_dump of our 6.4.2 database and tried to load it
> > back into 6.5 - it failed with the error message:
>
> > FATAL 1:  btree: failed to add item to the page
>
> IIRC this just means the tuple is too long ... btrees want to be able to
> fit at least two tuples per disk page, so indexed fields can't exceed
> 4k bytes in a stock installation.  Sometimes you'll get away with more,
> but not if two such keys end up on the same btree page.

Ok, well this is quite interesting actually. The test example I sent had
very large procedures, but my actual real life case contains functions
with length(prosrc) = 2082, 2059, 18888, 1841, 1525 ... etc bytes long. So
I am nowhere near 4096 bytes, but I have crossed the 2048 byte boundary.

The error message is the same for both my test case and the real life
pg_dump so I'm not sure what this indicates. Is the problem actually at
2048 bytes?

> It's not real clear to me *why* we are keeping an index on the prosrc
> field of pg_proc, but we evidently are, so plpgsql source code can't
> safely exceed 4k per proc as things stand.
>
> In short, it was only by chance that you were able to put this set of
> procs into 6.4 in the first place :-(

Yeah, this makes sense now. When we used to reload our procedures, I
always did a vacuum before hand which seemed to make it more reliable, and
then we would only replace one function at a time (ie, never a bulk reload
of all our functions).

Every so often we'd have a problem when playing with test databases, but
we were always careful with our real one so managed to avoid it.

> > Can any hackers comment on whether pg_proc_prosrc_index is really
> necessary??  Just dropping it would allow plpgsql sources to approach 8k,
> and I can't think of any scenario where it's needed...

Eeeep! I went and tried this and got some really bizarre behaviour:

psql>UPDATE pg_class SET relname = 'dog' WHERE relname ='pg_proc_prosrc_index';
postgres> mv pg_proc_prosrc_index dog
psql> DROP INDEX pg_proc_prosrc_index;

Then, whenever I try to insert a function into pg_proc:

create function "test" (int4, text) RETURNS int4 AS
'/home/postgres/functions.so' LANGUAGE 'c';

The backend dies, but the errlog contains no error message at all.
/var/log/messages says the backend died with a segmentation fault. Eeep!


So I don't know why this is dying, is the way I dropped the index ok? I
couldn't think of any other way to do this because the backend won't let
me drop or work on any pg_* tables.

> BTW, Jan has been muttering about compressing plpgsql source, which
> would provide some more breathing room for big procs, but not before 6.6.

I would be happy to drop the pg_proc_prosrc_index - now that I know the
limits of plpgsql functions I can rewrite them to call other functions or
something like that to make sure they fit within 4k, but mine are dying at
2k as well, which is bad :(

I personally would think the prosrc index could go because what kind of
query could possibly use this index?


thanks for your help,
Wayne

------------------------------------------------------------------------------
Wayne Piekarski                               Tel:     (08) 8221 5221
Research & Development Manager                Fax:     (08) 8221 5220
SE Network Access Pty Ltd                     Mob:     0407 395 889
222 Grote Street                              Email:   wayne@senet.com.au
Adelaide SA 5000                              WWW:     http://www.senet.com.au