Thread: Migrating an application with Oracle temporary tables
One of the issues I have run into in migrating from Oracle to PostgreSQL is the difference in temporary table behavior. I've come up with what I think is a cunning plan, but I wonder if anyone with more PostgreSQL understanding can shoot holes in it.
Oracle defines their temporary tables statically, such that the first use in a session instantiates a private instance of the table. That implies that you need to jump through flaming hoops in order to be able to use temporary tables in stored procedures and functions.
My idea was to create a "prototype" table (which will never have any rows in it) and, when I need to use the temporary table, create the temporary table such that it hides the prototype. I have a simple example that seems to work, although before I invest more effort in using it for real work, I'd like to get a second opinion as to whether this approach is going to fail in a more complex scenario.
drop function if exists bar;
drop table if exists foo;
create table FOO(X INTEGER not null constraint PTCC_FOO check (X is NULL));
commit;
create function bar() returns INTEGER
as
$$
DECLARE
result INTEGER;
begin
select SUM(X) into result from FOO ;
return result;
end;
$$ language plpgsql;
commit;
create temporary table FOO(X INTEGER) on commit drop;
alter table FOO add constraint PK_FOO primary KEY(X);
insert into FOO(X) VALUES(1);
insert into FOO(X) VALUES(2);
insert into FOO(X) VALUES(3);
select 'Expect to see 6' as Expectation, BAR();
commit;
create temporary table FOO(X INTEGER) on commit drop;
alter table FOO add constraint PK_FOO primary key (X);
insert into FOO(X) VALUES(1);
insert into FOO(X) VALUES(2);
insert into FOO(X) VALUES(3);
insert into FOO(X) VALUES(4);
select 'Expect to see 10' as Expectation, BAR();
rollback;
create temporary table FOO(X INTEGER) on commit drop;
insert into FOO(X) VALUES(1);
insert into FOO(X) VALUES(2);
select 'Expect to see 3' as Expectation, BAR();
commit;
What I am not clear on is what the rules are as to when a function/procedure is effectively recompiled. Is there a danger that. assuming the temporary table is created for a session that one session might see another session's data due to the procedure having effectively compiled the temporary table into its definition?
While this approach does have the disadvantage of requiring the application to define the temporary table before using it (which could be as simple as using `CREATE TABLE AS SELECT * FROM prototype_table`), it seems simpler and potentially more performant than the approach I found here: https://www.codeproject.com/Articles/1176045/Oracle-style-global-temporary-tables-for-PostgreSQ
It is also in direct opposition to this post I found: https://www.cybertec-postgresql.com/en/using-temporary-tables-the-way-they-should-not-be-used/
So far, I have not found a case where, as long as I don’t read or write to the permanent table, I get the wrong results from the above approach. It allows me to minimize the impact on my application (basically, it means that at the start of any transaction that might need a certain temporary table, I need to manually create it. The number of places I would need to do that is relatively finite, so I’m willing to take that hit, in exchange for not having to use dynamic SQL to refer to my temporary tables.
Mark Zellers wrote: > One of the issues I have run into in migrating from Oracle to PostgreSQL is the difference in temporary table behavior. > I've come up with what I think is a cunning plan, but I wonder if anyone with more PostgreSQL understanding can shoot holesin it. > > Oracle defines their temporary tables statically, such that the first use in a session instantiates a private instanceof the table. > That implies that you need to jump through flaming hoops in order to be able to use temporary tables in stored proceduresand functions. > > My idea was to create a "prototype" table (which will never have any rows in it) and, when I need to use the temporarytable, > create the temporary table such that it hides the prototype. I have a simple example that seems to work, > although before I invest more effort in using it for real work, I'd like to get a second opinion as to whether this > approach is going to fail in a more complex scenario. [...] > What I am not clear on is what the rules are as to when a function/procedure is effectively recompiled. > Is there a danger that. assuming the temporary table is created for a session that one session might see another session'sdata due > to the procedure having effectively compiled the temporary table into its definition? There is no danger of that, because a function is parsed whenever it is executed, and certainly different database sessions never share any "compiled version" of the function or query execution plans. What does get cached are the execution plans of SQL statements in a PL/pgSQL function, but only across different calls in the same database session. This does not seem to cause problems in your case (the plans are probably invalidated; I am not certain), but you can avoid that behavior by using dynamic SQL (EXECUTE 'SELECT ...'). > While this approach does have the disadvantage of requiring the application to define the temporary table before usingit (which > could be as simple as using `CREATE TABLE AS SELECT * FROM prototype_table`), it seems simpler and potentially more performantthan > the approach I found here: https://www.codeproject.com/Articles/1176045/Oracle-style-global-temporary-tables-for-PostgreSQ > > It is also in direct opposition to this post I found: https://www.cybertec-postgresql.com/en/using-temporary-tables-the-way-they-should-not-be-used/ In this case, the "trouble and pain" behavior is what you explicitly want, so don't worry. > So far, I have not found a case where, as long as I don’t read or write to the permanent table, I get the wrong resultsfrom > the above approach. It allows me to minimize the impact on my application (basically, it means that at the start of anytransaction > that might need a certain temporary table, I need to manually create it. The number of places I would need to do thatis relatively finite, > so I’m willing to take that hit, in exchange for not having to use dynamic SQL to refer to my temporary tables. The biggest problem I see with your approach is that dropping temporary tables causes a DELETE in the system catalogs "pg_class" and "pg_attribute", and if you do that very often, particularly the latter table tends to get bloated. Either don't drop temporary tables all the time or make sure that "pg_attribute" gets vacuumed fast enough. Other than that, I don't really see the need for keeping a permanent "blueprint" table around; all this can lead to is confusion. True, it is cute to use CREATE TABLE ... (LIKE ...), but a few lines more in your code won't kill you. Particularly since you have that only once in your code, right? Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
What I am not clear on is what the rules are as to when a function/procedure is effectively recompiled. Is there a danger that. assuming the temporary table is created for a session that one session might see another session's data due to the procedure having effectively compiled the temporary table into its definition?
While this approach does have the disadvantage of requiring the application to define the temporary table before using it (which could be as simple as using `CREATE TABLE AS SELECT * FROM prototype_table`), it seems simpler and potentially more performant than the approach I found here: https://www.codeproject.com/Articles/1176045/Oracle-style-global-temporary-tables-for-PostgreSQ
It is also in direct opposition to this post I found: https://www.cybertec-postgresql.com/en/using-temporary-tables-the-way-they-should-not-be-used/
So far, I have not found a case where, as long as I don’t read or write to the permanent table, I get the wrong results from the above approach. It allows me to minimize the impact on my application (basically, it means that at the start of any transaction that might need a certain temporary table, I need to manually create it. The number of places I would need to do that is relatively finite, so I’m willing to take that hit, in exchange for not having to use dynamic SQL to refer to my temporary tables.
Laurenz, I thought I needed the prototype table to be able to define functions and procedures that refer to the temporary table butdo not create it. Perhaps my assumption that I need the table to exist (whether as a temporary table or as a permanent table) in order to definethe function/procedure is incorrect. I'll take a look at that. I did find a scenario where this approach does run into trouble. That is, if the function/procedure is executed againstthe permanent table and then you go to run it against a temporary table. In that case, I do get the wrong answer,and I haven't yet figured out how to reset that without dropping the procedure and re-defining it. For my purposes,that is "good enough" -- I can promise not to run such procedures against the temporary table. Thanks for the reply, Mark Z. -----Original Message----- From: Laurenz Albe <laurenz.albe@cybertec.at> Sent: Thursday, May 2, 2019 1:00 AM To: Mark Zellers <markz@adaptiveinsights.com>; pgsql-general@lists.postgresql.org Subject: Re: Migrating an application with Oracle temporary tables Mark Zellers wrote: > One of the issues I have run into in migrating from Oracle to PostgreSQL is the difference in temporary table behavior. > I've come up with what I think is a cunning plan, but I wonder if anyone with more PostgreSQL understanding can shoot holesin it. > > Oracle defines their temporary tables statically, such that the first use in a session instantiates a private instanceof the table. > That implies that you need to jump through flaming hoops in order to be able to use temporary tables in stored proceduresand functions. > > My idea was to create a "prototype" table (which will never have any > rows in it) and, when I need to use the temporary table, create the > temporary table such that it hides the prototype. I have a simple > example that seems to work, although before I invest more effort in using it for real work, I'd like to get a second opinionas to whether this approach is going to fail in a more complex scenario. [...] > What I am not clear on is what the rules are as to when a function/procedure is effectively recompiled. > Is there a danger that. assuming the temporary table is created for a > session that one session might see another session's data due to the procedure having effectively compiled the temporarytable into its definition? There is no danger of that, because a function is parsed whenever it is executed, and certainly different database sessionsnever share any "compiled version" of the function or query execution plans. What does get cached are the execution plans of SQL statements in a PL/pgSQL function, but only across different calls inthe same database session. This does not seem to cause problems in your case (the plans are probably invalidated; I am not certain), but you can avoidthat behavior by using dynamic SQL (EXECUTE 'SELECT ...'). > While this approach does have the disadvantage of requiring the > application to define the temporary table before using it (which could > be as simple as using `CREATE TABLE AS SELECT * FROM > prototype_table`), it seems simpler and potentially more performant > than the approach I found here: > https://urldefense.proofpoint.com/v2/url?u=https-3A__www.codeproject.c > om_Articles_1176045_Oracle-2Dstyle-2Dglobal-2Dtemporary-2Dtables-2Dfor > -2DPostgreSQ&d=DwIDaQ&c=DS6PUFBBr_KiLo7Sjt3ljp5jaW5k2i9ijVXllEdOozc&r= > WK2G_319M5jcG-UdecAitIGR-UzTlASAsP7ybqXtD2Y&m=gYKOaSte_-ZRnU2RgctTQfAn > J9BhFhxowJBy7upH7KE&s=I_x0KCEfHgBXOoSRYXpAehb5uRodwY11-m6JRYf9A_Y&e= > > It is also in direct opposition to this post I found: > https://urldefense.proofpoint.com/v2/url?u=https-3A__www.cybertec-2Dpo > stgresql.com_en_using-2Dtemporary-2Dtables-2Dthe-2Dway-2Dthey-2Dshould > -2Dnot-2Dbe-2Dused_&d=DwIDaQ&c=DS6PUFBBr_KiLo7Sjt3ljp5jaW5k2i9ijVXllEd > Oozc&r=WK2G_319M5jcG-UdecAitIGR-UzTlASAsP7ybqXtD2Y&m=gYKOaSte_-ZRnU2Rg > ctTQfAnJ9BhFhxowJBy7upH7KE&s=OJVABOenTvrP2J1ubWXbaMayh1AVLOp2ieQ4HeacV > 0I&e= In this case, the "trouble and pain" behavior is what you explicitly want, so don't worry. > So far, I have not found a case where, as long as I don’t read or > write to the permanent table, I get the wrong results from the above > approach. It allows me to minimize the impact on my application > (basically, it means that at the start of any transaction that might need a certain temporary table, I need to manuallycreate it. The number of places I would need to do that is relatively finite, so I’m willing to take that hit, inexchange for not having to use dynamic SQL to refer to my temporary tables. The biggest problem I see with your approach is that dropping temporary tables causes a DELETE in the system catalogs "pg_class"and "pg_attribute", and if you do that very often, particularly the latter table tends to get bloated. Eitherdon't drop temporary tables all the time or make sure that "pg_attribute" gets vacuumed fast enough. Other than that, I don't really see the need for keeping a permanent "blueprint" table around; all this can lead to is confusion. True, it is cute to use CREATE TABLE ... (LIKE ...), but a few lines more in your code won't kill you. Particularlysince you have that only once in your code, right? Yours, Laurenz Albe -- Cybertec | https://urldefense.proofpoint.com/v2/url?u=https-3A__www.cybertec-2Dpostgresql.com&d=DwIDaQ&c=DS6PUFBBr_KiLo7Sjt3ljp5jaW5k2i9ijVXllEdOozc&r=WK2G_319M5jcG-UdecAitIGR-UzTlASAsP7ybqXtD2Y&m=gYKOaSte_-ZRnU2RgctTQfAnJ9BhFhxowJBy7upH7KE&s=Sg60ggj5ptVnAaaRbEl0NpGMBm8_GoHO_msgOqa8vgE&e=
On Thu, 2019-05-02 at 16:55 +0000, Mark Zellers wrote: > I thought I needed the prototype table to be able to define functions and procedures that refer to the temporary tablebut do not create it. > > Perhaps my assumption that I need the table to exist (whether as a temporary table or as a permanent table) in > order to define the function/procedure is incorrect. I'll take a look at that. You don't need the table to exist at function definition time. The following works just fine, even if the table does not exist: CREATE FUNCTION f() RETURNS void LANGUAGE plpgsql AS $$BEGIN PERFORM * FROM notexists; END;$$; This is because functions are not parsed when they are defined. > I did find a scenario where this approach does run into trouble. That is, if the function/procedure is executed > against the permanent table and then you go to run it against a temporary table. In that case, I do get the > wrong answer, and I haven't yet figured out how to reset that without dropping the procedure and re-defining it. > For my purposes, that is "good enough" -- I can promise not to run such procedures against the temporary table. Yes, that would cause a problem. The SQL statement "DISCARD PLANS" should fix the problem. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
On Thu, 2019-05-02 at 16:55 +0000, Mark Zellers wrote:
> I thought I needed the prototype table to be able to define functions and procedures that refer to the temporary table but do not create it.
>
> Perhaps my assumption that I need the table to exist (whether as a temporary table or as a permanent table) in
> order to define the function/procedure is incorrect. I'll take a look at that.
You don't need the table to exist at function definition time.
The following works just fine, even if the table does not exist:
CREATE FUNCTION f() RETURNS void LANGUAGE plpgsql AS $$BEGIN PERFORM * FROM notexists; END;$$;
This is because functions are not parsed when they are defined.
> I did find a scenario where this approach does run into trouble. That is, if the function/procedure is executed
> against the permanent table and then you go to run it against a temporary table. In that case, I do get the
> wrong answer, and I haven't yet figured out how to reset that without dropping the procedure and re-defining it.
> For my purposes, that is "good enough" -- I can promise not to run such procedures against the temporary table.
Yes, that would cause a problem.
The SQL statement "DISCARD PLANS" should fix the problem.
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com