Thread: Bug with plpgsql, temp tables and TOAST?

Bug with plpgsql, temp tables and TOAST?

From
Matthijs Bomhoff
Date:
Hi,

When I run the SQL below, I get an error on the third call to foo() : "ERRO=
R:  could not open relation with OID 884693". I'm quite sure this OID belon=
gs to the TOAST table corresponding to the temporary table created by foo()=
 during the third call. The fourth call works fine again.

I suspect the third one fails because the data is too large (even when comp=
ressed) to be stored without using toast. I can't reproduce the issue if fo=
r example I use a long string of identical characters instead of "random" o=
nes. My guess would be that the TOAST table is still somehow referenced by =
the result value, even though the table itself has been dropped by the time=
 the result value is used.

I have tested this myself on 8.4.4 and it has also been verified on a 9.1 b=
y someone on #postgresql.

Kind regards,

Matthijs Bomhoff



CREATE OR REPLACE FUNCTION foo(size_ INTEGER) RETURNS TEXT AS $EOF$
DECLARE
  acc_ TEXT :=3D '';
  cur_rec_ RECORD;
BEGIN
  EXECUTE 'CREATE TEMPORARY TABLE foo_tab(blob TEXT NOT NULL)';

  -- Construct a string with random characters to prevent compression (with=
 high probability)
  LOOP
    EXIT WHEN length(acc_) >=3D size_;
    acc_ :=3D acc_ || chr(ceil(random()*64)::integer + 32);
  END LOOP;

  EXECUTE 'INSERT INTO foo_tab(blob) values (' || quote_literal(acc_) || ')=
';
  EXECUTE 'SELECT * FROM foo_tab LIMIT 1' INTO cur_rec_;
  EXECUTE 'DROP TABLE foo_tab';
  RETURN cur_rec_.blob;
END
$EOF$ LANGUAGE plpgsql;

SELECT md5(foo(10));
SELECT md5(foo(20));
SELECT md5(foo(40000)); -- This one breaks on my 8.4.4
SELECT md5(foo(30)); -- And this one works fine again

DROP FUNCTION foo(INTEGER);

Re: Bug with plpgsql, temp tables and TOAST?

From
Alvaro Herrera
Date:
Excerpts from Matthijs Bomhoff's message of mié jun 29 07:40:07 -0400 2011:

> CREATE OR REPLACE FUNCTION foo(size_ INTEGER) RETURNS TEXT AS $EOF$
> DECLARE
>   acc_ TEXT := '';
>   cur_rec_ RECORD;
> BEGIN
>   EXECUTE 'CREATE TEMPORARY TABLE foo_tab(blob TEXT NOT NULL)';
>
>   -- Construct a string with random characters to prevent compression (with high probability)
>   LOOP
>     EXIT WHEN length(acc_) >= size_;
>     acc_ := acc_ || chr(ceil(random()*64)::integer + 32);
>   END LOOP;
>
>   EXECUTE 'INSERT INTO foo_tab(blob) values (' || quote_literal(acc_) || ')';
>   EXECUTE 'SELECT * FROM foo_tab LIMIT 1' INTO cur_rec_;
>   EXECUTE 'DROP TABLE foo_tab';
>   RETURN cur_rec_.blob;
> END
> $EOF$ LANGUAGE plpgsql;

Hmm, so what's happening here, I think, is that the value is getting
assigned to the record variable without detoasting.  I guess we should
detoast the value prior to assigning it, but it seems to me that that
would have a large performance penalty for other cases in which the
toast table is not dropped; in fact, you can even imagine some cases in
which the toasted value is not even accessed, so getting to the point of
detoasting it would be a severe penalization.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Bug with plpgsql, temp tables and TOAST?

From
Matthijs Bomhoff
Date:
On Jun 29, 2011, at 6:03 PM, Alvaro Herrera wrote:

> Excerpts from Matthijs Bomhoff's message of mi=E9 jun 29 07:40:07 -0400 2=
011:
>=20
>> CREATE OR REPLACE FUNCTION foo(size_ INTEGER) RETURNS TEXT AS $EOF$
>> DECLARE
>>  acc_ TEXT :=3D '';
>>  cur_rec_ RECORD;
>> BEGIN
>>  EXECUTE 'CREATE TEMPORARY TABLE foo_tab(blob TEXT NOT NULL)';
>>=20
>>  -- Construct a string with random characters to prevent compression (wi=
th high probability)
>>  LOOP
>>    EXIT WHEN length(acc_) >=3D size_;
>>    acc_ :=3D acc_ || chr(ceil(random()*64)::integer + 32);
>>  END LOOP;
>>=20
>>  EXECUTE 'INSERT INTO foo_tab(blob) values (' || quote_literal(acc_) || =
')';
>>  EXECUTE 'SELECT * FROM foo_tab LIMIT 1' INTO cur_rec_;
>>  EXECUTE 'DROP TABLE foo_tab';
>>  RETURN cur_rec_.blob;
>> END
>> $EOF$ LANGUAGE plpgsql;
>=20
> Hmm, so what's happening here, I think, is that the value is getting
> assigned to the record variable without detoasting.  I guess we should
> detoast the value prior to assigning it, but it seems to me that that
> would have a large performance penalty for other cases in which the
> toast table is not dropped; in fact, you can even imagine some cases in
> which the toasted value is not even accessed, so getting to the point of
> detoasting it would be a severe penalization.

Possibly related: In some earlier attempts at reproducing this, I actually =
tried to use length() instead of md5(). It seemed that I could not get it t=
o trigger with that. Just selecting * triggers it of course, but caused a b=
it too much clutter in my psql for the required long random strings, so I h=
ad to find a function that would actually need the value itself. Hence the =
md5().

Regards,

Matthijs

Re: Bug with plpgsql, temp tables and TOAST?

From
Alvaro Herrera
Date:
Excerpts from Matthijs Bomhoff's message of mié jun 29 15:20:49 -0400 2011:

> Possibly related: In some earlier attempts at reproducing this, I
> actually tried to use length() instead of md5(). It seemed that I
> could not get it to trigger with that. Just selecting * triggers it of
> course, but caused a bit too much clutter in my psql for the required
> long random strings, so I had to find a function that would actually
> need the value itself. Hence the md5().

Yeah, the length can be obtained from the toast header AFAIR, so you
don't need to do a full detoast of the value, which masks the problem.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Bug with plpgsql, temp tables and TOAST?

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Hmm, so what's happening here, I think, is that the value is getting
> assigned to the record variable without detoasting.  I guess we should
> detoast the value prior to assigning it, but it seems to me that that
> would have a large performance penalty for other cases in which the
> toast table is not dropped; in fact, you can even imagine some cases in
> which the toasted value is not even accessed, so getting to the point of
> detoasting it would be a severe penalization.

Yeah, we have seen similar reports before, and concluded that forcibly
detoasting on the remote chance that the toast table wouldn't stick
around was just untenable from a performance standpoint.  It's annoying
but I don't see a good compromise solution ATM.

Pavel was working recently on a hack to prevent repeated detoastings,
which is at least somewhat related to this issue.  I don't recall that
anyone liked his hack though ...

            regards, tom lane