Thread: Bug with plpgsql, temp tables and TOAST?
Hi, When I run the SQL below, I get an error on the third call to foo() : "ERRO= R: could not open relation with OID 884693". I'm quite sure this OID belon= gs to the TOAST table corresponding to the temporary table created by foo()= during the third call. The fourth call works fine again. I suspect the third one fails because the data is too large (even when comp= ressed) to be stored without using toast. I can't reproduce the issue if fo= r example I use a long string of identical characters instead of "random" o= nes. My guess would be that the TOAST table is still somehow referenced by = the result value, even though the table itself has been dropped by the time= the result value is used. I have tested this myself on 8.4.4 and it has also been verified on a 9.1 b= y someone on #postgresql. Kind regards, Matthijs Bomhoff CREATE OR REPLACE FUNCTION foo(size_ INTEGER) RETURNS TEXT AS $EOF$ DECLARE acc_ TEXT :=3D ''; cur_rec_ RECORD; BEGIN EXECUTE 'CREATE TEMPORARY TABLE foo_tab(blob TEXT NOT NULL)'; -- Construct a string with random characters to prevent compression (with= high probability) LOOP EXIT WHEN length(acc_) >=3D size_; acc_ :=3D acc_ || chr(ceil(random()*64)::integer + 32); END LOOP; EXECUTE 'INSERT INTO foo_tab(blob) values (' || quote_literal(acc_) || ')= '; EXECUTE 'SELECT * FROM foo_tab LIMIT 1' INTO cur_rec_; EXECUTE 'DROP TABLE foo_tab'; RETURN cur_rec_.blob; END $EOF$ LANGUAGE plpgsql; SELECT md5(foo(10)); SELECT md5(foo(20)); SELECT md5(foo(40000)); -- This one breaks on my 8.4.4 SELECT md5(foo(30)); -- And this one works fine again DROP FUNCTION foo(INTEGER);
Excerpts from Matthijs Bomhoff's message of mié jun 29 07:40:07 -0400 2011: > CREATE OR REPLACE FUNCTION foo(size_ INTEGER) RETURNS TEXT AS $EOF$ > DECLARE > acc_ TEXT := ''; > cur_rec_ RECORD; > BEGIN > EXECUTE 'CREATE TEMPORARY TABLE foo_tab(blob TEXT NOT NULL)'; > > -- Construct a string with random characters to prevent compression (with high probability) > LOOP > EXIT WHEN length(acc_) >= size_; > acc_ := acc_ || chr(ceil(random()*64)::integer + 32); > END LOOP; > > EXECUTE 'INSERT INTO foo_tab(blob) values (' || quote_literal(acc_) || ')'; > EXECUTE 'SELECT * FROM foo_tab LIMIT 1' INTO cur_rec_; > EXECUTE 'DROP TABLE foo_tab'; > RETURN cur_rec_.blob; > END > $EOF$ LANGUAGE plpgsql; Hmm, so what's happening here, I think, is that the value is getting assigned to the record variable without detoasting. I guess we should detoast the value prior to assigning it, but it seems to me that that would have a large performance penalty for other cases in which the toast table is not dropped; in fact, you can even imagine some cases in which the toasted value is not even accessed, so getting to the point of detoasting it would be a severe penalization. -- Ãlvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Jun 29, 2011, at 6:03 PM, Alvaro Herrera wrote: > Excerpts from Matthijs Bomhoff's message of mi=E9 jun 29 07:40:07 -0400 2= 011: >=20 >> CREATE OR REPLACE FUNCTION foo(size_ INTEGER) RETURNS TEXT AS $EOF$ >> DECLARE >> acc_ TEXT :=3D ''; >> cur_rec_ RECORD; >> BEGIN >> EXECUTE 'CREATE TEMPORARY TABLE foo_tab(blob TEXT NOT NULL)'; >>=20 >> -- Construct a string with random characters to prevent compression (wi= th high probability) >> LOOP >> EXIT WHEN length(acc_) >=3D size_; >> acc_ :=3D acc_ || chr(ceil(random()*64)::integer + 32); >> END LOOP; >>=20 >> EXECUTE 'INSERT INTO foo_tab(blob) values (' || quote_literal(acc_) || = ')'; >> EXECUTE 'SELECT * FROM foo_tab LIMIT 1' INTO cur_rec_; >> EXECUTE 'DROP TABLE foo_tab'; >> RETURN cur_rec_.blob; >> END >> $EOF$ LANGUAGE plpgsql; >=20 > Hmm, so what's happening here, I think, is that the value is getting > assigned to the record variable without detoasting. I guess we should > detoast the value prior to assigning it, but it seems to me that that > would have a large performance penalty for other cases in which the > toast table is not dropped; in fact, you can even imagine some cases in > which the toasted value is not even accessed, so getting to the point of > detoasting it would be a severe penalization. Possibly related: In some earlier attempts at reproducing this, I actually = tried to use length() instead of md5(). It seemed that I could not get it t= o trigger with that. Just selecting * triggers it of course, but caused a b= it too much clutter in my psql for the required long random strings, so I h= ad to find a function that would actually need the value itself. Hence the = md5(). Regards, Matthijs
Excerpts from Matthijs Bomhoff's message of mié jun 29 15:20:49 -0400 2011: > Possibly related: In some earlier attempts at reproducing this, I > actually tried to use length() instead of md5(). It seemed that I > could not get it to trigger with that. Just selecting * triggers it of > course, but caused a bit too much clutter in my psql for the required > long random strings, so I had to find a function that would actually > need the value itself. Hence the md5(). Yeah, the length can be obtained from the toast header AFAIR, so you don't need to do a full detoast of the value, which masks the problem. -- Ãlvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: > Hmm, so what's happening here, I think, is that the value is getting > assigned to the record variable without detoasting. I guess we should > detoast the value prior to assigning it, but it seems to me that that > would have a large performance penalty for other cases in which the > toast table is not dropped; in fact, you can even imagine some cases in > which the toasted value is not even accessed, so getting to the point of > detoasting it would be a severe penalization. Yeah, we have seen similar reports before, and concluded that forcibly detoasting on the remote chance that the toast table wouldn't stick around was just untenable from a performance standpoint. It's annoying but I don't see a good compromise solution ATM. Pavel was working recently on a hack to prevent repeated detoastings, which is at least somewhat related to this issue. I don't recall that anyone liked his hack though ... regards, tom lane