Migrating an application with Oracle temporary tables

Started by Mark Zellersalmost 7 years ago7 messagesgeneral
Jump to latest
#1Mark Zellers
markz@adaptiveinsights.com

One of the issues I have run into in migrating from Oracle to PostgreSQL is the difference in temporary table behavior. I've come up with what I think is a cunning plan, but I wonder if anyone with more PostgreSQL understanding can shoot holes in it.

Oracle defines their temporary tables statically, such that the first use in a session instantiates a private instance of the table. That implies that you need to jump through flaming hoops in order to be able to use temporary tables in stored procedures and functions.

My idea was to create a "prototype" table (which will never have any rows in it) and, when I need to use the temporary table, create the temporary table such that it hides the prototype. I have a simple example that seems to work, although before I invest more effort in using it for real work, I'd like to get a second opinion as to whether this approach is going to fail in a more complex scenario.

drop function if exists bar;
drop table if exists foo;

create table FOO(X INTEGER not null constraint PTCC_FOO check (X is NULL));
commit;

create function bar() returns INTEGER
as
$$
DECLARE
result INTEGER;
begin
select SUM(X) into result from FOO ;
return result;
end;
$$ language plpgsql;

commit;

create temporary table FOO(X INTEGER) on commit drop;
alter table FOO add constraint PK_FOO primary KEY(X);
insert into FOO(X) VALUES(1);
insert into FOO(X) VALUES(2);
insert into FOO(X) VALUES(3);

select 'Expect to see 6' as Expectation, BAR();
commit;

create temporary table FOO(X INTEGER) on commit drop;
alter table FOO add constraint PK_FOO primary key (X);
insert into FOO(X) VALUES(1);
insert into FOO(X) VALUES(2);
insert into FOO(X) VALUES(3);
insert into FOO(X) VALUES(4);
select 'Expect to see 10' as Expectation, BAR();
rollback;

create temporary table FOO(X INTEGER) on commit drop;
insert into FOO(X) VALUES(1);
insert into FOO(X) VALUES(2);
select 'Expect to see 3' as Expectation, BAR();
commit;

What I am not clear on is what the rules are as to when a function/procedure is effectively recompiled. Is there a danger that. assuming the temporary table is created for a session that one session might see another session's data due to the procedure having effectively compiled the temporary table into its definition?

While this approach does have the disadvantage of requiring the application to define the temporary table before using it (which could be as simple as using `CREATE TABLE AS SELECT * FROM prototype_table`), it seems simpler and potentially more performant than the approach I found here: https://www.codeproject.com/Articles/1176045/Oracle-style-global-temporary-tables-for-PostgreSQ

It is also in direct opposition to this post I found: https://www.cybertec-postgresql.com/en/using-temporary-tables-the-way-they-should-not-be-used/

So far, I have not found a case where, as long as I don't read or write to the permanent table, I get the wrong results from the above approach. It allows me to minimize the impact on my application (basically, it means that at the start of any transaction that might need a certain temporary table, I need to manually create it. The number of places I would need to do that is relatively finite, so I'm willing to take that hit, in exchange for not having to use dynamic SQL to refer to my temporary tables.

#2Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Mark Zellers (#1)
Re: Migrating an application with Oracle temporary tables

Mark Zellers wrote:

One of the issues I have run into in migrating from Oracle to PostgreSQL is the difference in temporary table behavior.
I've come up with what I think is a cunning plan, but I wonder if anyone with more PostgreSQL understanding can shoot holes in it.

Oracle defines their temporary tables statically, such that the first use in a session instantiates a private instance of the table.
That implies that you need to jump through flaming hoops in order to be able to use temporary tables in stored procedures and functions.

My idea was to create a "prototype" table (which will never have any rows in it) and, when I need to use the temporary table,
create the temporary table such that it hides the prototype. I have a simple example that seems to work,
although before I invest more effort in using it for real work, I'd like to get a second opinion as to whether this
approach is going to fail in a more complex scenario.

[...]

What I am not clear on is what the rules are as to when a function/procedure is effectively recompiled.
Is there a danger that. assuming the temporary table is created for a session that one session might see another session's data due
to the procedure having effectively compiled the temporary table into its definition?

There is no danger of that, because a function is parsed whenever it is executed, and certainly
different database sessions never share any "compiled version" of the function or query execution plans.

What does get cached are the execution plans of SQL statements in a PL/pgSQL function, but only
across different calls in the same database session.

This does not seem to cause problems in your case (the plans are probably invalidated; I am not certain),
but you can avoid that behavior by using dynamic SQL (EXECUTE 'SELECT ...').

While this approach does have the disadvantage of requiring the application to define the temporary table before using it (which
could be as simple as using `CREATE TABLE AS SELECT * FROM prototype_table`), it seems simpler and potentially more performant than
the approach I found here: https://www.codeproject.com/Articles/1176045/Oracle-style-global-temporary-tables-for-PostgreSQ

It is also in direct opposition to this post I found: https://www.cybertec-postgresql.com/en/using-temporary-tables-the-way-they-should-not-be-used/

In this case, the "trouble and pain" behavior is what you explicitly want, so don't worry.

So far, I have not found a case where, as long as I don’t read or write to the permanent table, I get the wrong results from
the above approach. It allows me to minimize the impact on my application (basically, it means that at the start of any transaction
that might need a certain temporary table, I need to manually create it. The number of places I would need to do that is relatively finite,
so I’m willing to take that hit, in exchange for not having to use dynamic SQL to refer to my temporary tables.

The biggest problem I see with your approach is that dropping temporary tables causes a DELETE
in the system catalogs "pg_class" and "pg_attribute", and if you do that very often, particularly
the latter table tends to get bloated. Either don't drop temporary tables all the time or
make sure that "pg_attribute" gets vacuumed fast enough.

Other than that, I don't really see the need for keeping a permanent "blueprint" table
around; all this can lead to is confusion. True, it is cute to use CREATE TABLE ... (LIKE ...),
but a few lines more in your code won't kill you. Particularly since you have that only once
in your code, right?

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

#3Pavel Stehule
pavel.stehule@gmail.com
In reply to: Mark Zellers (#1)
Re: Migrating an application with Oracle temporary tables

Hi

What I am not clear on is what the rules are as to when a

function/procedure is effectively recompiled. Is there a danger that.
assuming the temporary table is created for a session that one session
might see another session's data due to the procedure having effectively
compiled the temporary table into its definition?

it's not a problem - the plpgsql engine is implemented significantly
differently than PL/SQL

plpgsql code is not compiled - it is just validated in "create" time and it
is stored in original text form. When function is started first time in
session, then source code is reparsed again to syntax tree, and this tree
is evaluated. Inner SQL queries are prepared.

When you create and drop table, then prepared statements are automatically
replaned.

Sometimes can be strange (and difficult) so plpgsql is similar to PL/SQL,
but the implementation is maximally different - so some knowledges and
experience are portable (and lot of related to internals) are not portable.

Regards

Pavel

Show quoted text

While this approach does have the disadvantage of requiring the
application to define the temporary table before using it (which could be
as simple as using `CREATE TABLE AS SELECT * FROM prototype_table`), it
seems simpler and potentially more performant than the approach I found
here:
https://www.codeproject.com/Articles/1176045/Oracle-style-global-temporary-tables-for-PostgreSQ

It is also in direct opposition to this post I found:
https://www.cybertec-postgresql.com/en/using-temporary-tables-the-way-they-should-not-be-used/

So far, I have not found a case where, as long as I don’t read or write to
the permanent table, I get the wrong results from the above approach. It
allows me to minimize the impact on my application (basically, it means
that at the start of any transaction that might need a certain temporary
table, I need to manually create it. The number of places I would need to
do that is relatively finite, so I’m willing to take that hit, in exchange
for not having to use dynamic SQL to refer to my temporary tables.

#4Mark Zellers
markz@adaptiveinsights.com
In reply to: Laurenz Albe (#2)
RE: Migrating an application with Oracle temporary tables

Laurenz,

I thought I needed the prototype table to be able to define functions and procedures that refer to the temporary table but do not create it.

Perhaps my assumption that I need the table to exist (whether as a temporary table or as a permanent table) in order to define the function/procedure is incorrect. I'll take a look at that.

I did find a scenario where this approach does run into trouble. That is, if the function/procedure is executed against the permanent table and then you go to run it against a temporary table. In that case, I do get the wrong answer, and I haven't yet figured out how to reset that without dropping the procedure and re-defining it. For my purposes, that is "good enough" -- I can promise not to run such procedures against the temporary table.

Thanks for the reply,

Mark Z.

-----Original Message-----
From: Laurenz Albe <laurenz.albe@cybertec.at>
Sent: Thursday, May 2, 2019 1:00 AM
To: Mark Zellers <markz@adaptiveinsights.com>; pgsql-general@lists.postgresql.org
Subject: Re: Migrating an application with Oracle temporary tables

Mark Zellers wrote:

One of the issues I have run into in migrating from Oracle to PostgreSQL is the difference in temporary table behavior.
I've come up with what I think is a cunning plan, but I wonder if anyone with more PostgreSQL understanding can shoot holes in it.

Oracle defines their temporary tables statically, such that the first use in a session instantiates a private instance of the table.
That implies that you need to jump through flaming hoops in order to be able to use temporary tables in stored procedures and functions.

My idea was to create a "prototype" table (which will never have any
rows in it) and, when I need to use the temporary table, create the
temporary table such that it hides the prototype. I have a simple
example that seems to work, although before I invest more effort in using it for real work, I'd like to get a second opinion as to whether this approach is going to fail in a more complex scenario.

[...]

What I am not clear on is what the rules are as to when a function/procedure is effectively recompiled.
Is there a danger that. assuming the temporary table is created for a
session that one session might see another session's data due to the procedure having effectively compiled the temporary table into its definition?

There is no danger of that, because a function is parsed whenever it is executed, and certainly different database sessions never share any "compiled version" of the function or query execution plans.

What does get cached are the execution plans of SQL statements in a PL/pgSQL function, but only across different calls in the same database session.

This does not seem to cause problems in your case (the plans are probably invalidated; I am not certain), but you can avoid that behavior by using dynamic SQL (EXECUTE 'SELECT ...').

While this approach does have the disadvantage of requiring the
application to define the temporary table before using it (which could
be as simple as using `CREATE TABLE AS SELECT * FROM
prototype_table`), it seems simpler and potentially more performant
than the approach I found here:
https://urldefense.proofpoint.com/v2/url?u=https-3A__www.codeproject.c
om_Articles_1176045_Oracle-2Dstyle-2Dglobal-2Dtemporary-2Dtables-2Dfor
-2DPostgreSQ&d=DwIDaQ&c=DS6PUFBBr_KiLo7Sjt3ljp5jaW5k2i9ijVXllEdOozc&r=
WK2G_319M5jcG-UdecAitIGR-UzTlASAsP7ybqXtD2Y&m=gYKOaSte_-ZRnU2RgctTQfAn
J9BhFhxowJBy7upH7KE&s=I_x0KCEfHgBXOoSRYXpAehb5uRodwY11-m6JRYf9A_Y&e=

It is also in direct opposition to this post I found:
https://urldefense.proofpoint.com/v2/url?u=https-3A__www.cybertec-2Dpo
stgresql.com_en_using-2Dtemporary-2Dtables-2Dthe-2Dway-2Dthey-2Dshould
-2Dnot-2Dbe-2Dused_&d=DwIDaQ&c=DS6PUFBBr_KiLo7Sjt3ljp5jaW5k2i9ijVXllEd
Oozc&r=WK2G_319M5jcG-UdecAitIGR-UzTlASAsP7ybqXtD2Y&m=gYKOaSte_-ZRnU2Rg
ctTQfAnJ9BhFhxowJBy7upH7KE&s=OJVABOenTvrP2J1ubWXbaMayh1AVLOp2ieQ4HeacV
0I&e=

In this case, the "trouble and pain" behavior is what you explicitly want, so don't worry.

So far, I have not found a case where, as long as I don’t read or
write to the permanent table, I get the wrong results from the above
approach. It allows me to minimize the impact on my application
(basically, it means that at the start of any transaction that might need a certain temporary table, I need to manually create it. The number of places I would need to do that is relatively finite, so I’m willing to take that hit, in exchange for not having to use dynamic SQL to refer to my temporary tables.

The biggest problem I see with your approach is that dropping temporary tables causes a DELETE in the system catalogs "pg_class" and "pg_attribute", and if you do that very often, particularly the latter table tends to get bloated. Either don't drop temporary tables all the time or make sure that "pg_attribute" gets vacuumed fast enough.

Other than that, I don't really see the need for keeping a permanent "blueprint" table around; all this can lead to is confusion. True, it is cute to use CREATE TABLE ... (LIKE ...), but a few lines more in your code won't kill you. Particularly since you have that only once in your code, right?

Yours,
Laurenz Albe
--
Cybertec | https://urldefense.proofpoint.com/v2/url?u=https-3A__www.cybertec-2Dpostgresql.com&amp;d=DwIDaQ&amp;c=DS6PUFBBr_KiLo7Sjt3ljp5jaW5k2i9ijVXllEdOozc&amp;r=WK2G_319M5jcG-UdecAitIGR-UzTlASAsP7ybqXtD2Y&amp;m=gYKOaSte_-ZRnU2RgctTQfAnJ9BhFhxowJBy7upH7KE&amp;s=Sg60ggj5ptVnAaaRbEl0NpGMBm8_GoHO_msgOqa8vgE&amp;e=

#5Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Mark Zellers (#4)
Re: Migrating an application with Oracle temporary tables

On Thu, 2019-05-02 at 16:55 +0000, Mark Zellers wrote:

I thought I needed the prototype table to be able to define functions and procedures that refer to the temporary table but do not create it.

Perhaps my assumption that I need the table to exist (whether as a temporary table or as a permanent table) in
order to define the function/procedure is incorrect. I'll take a look at that.

You don't need the table to exist at function definition time.
The following works just fine, even if the table does not exist:

CREATE FUNCTION f() RETURNS void LANGUAGE plpgsql AS $$BEGIN PERFORM * FROM notexists; END;$$;

This is because functions are not parsed when they are defined.

I did find a scenario where this approach does run into trouble. That is, if the function/procedure is executed
against the permanent table and then you go to run it against a temporary table. In that case, I do get the
wrong answer, and I haven't yet figured out how to reset that without dropping the procedure and re-defining it.
For my purposes, that is "good enough" -- I can promise not to run such procedures against the temporary table.

Yes, that would cause a problem.

The SQL statement "DISCARD PLANS" should fix the problem.

Yours,
Laurenz Albe

--
Cybertec | https://www.cybertec-postgresql.com

#6Pavel Stehule
pavel.stehule@gmail.com
In reply to: Laurenz Albe (#5)
Re: Migrating an application with Oracle temporary tables

pá 3. 5. 2019 v 8:19 odesílatel Laurenz Albe <laurenz.albe@cybertec.at>
napsal:

On Thu, 2019-05-02 at 16:55 +0000, Mark Zellers wrote:

I thought I needed the prototype table to be able to define functions

and procedures that refer to the temporary table but do not create it.

Perhaps my assumption that I need the table to exist (whether as a

temporary table or as a permanent table) in

order to define the function/procedure is incorrect. I'll take a look

at that.

You don't need the table to exist at function definition time.
The following works just fine, even if the table does not exist:

CREATE FUNCTION f() RETURNS void LANGUAGE plpgsql AS $$BEGIN PERFORM *
FROM notexists; END;$$;

This is because functions are not parsed when they are defined.

It is not fully correct - function with queries are parsed and syntax check
is done. But semantic check is deferred on run time.

Regards

Pavel

Show quoted text

I did find a scenario where this approach does run into trouble. That

is, if the function/procedure is executed

against the permanent table and then you go to run it against a

temporary table. In that case, I do get the

wrong answer, and I haven't yet figured out how to reset that without

dropping the procedure and re-defining it.

For my purposes, that is "good enough" -- I can promise not to run such

procedures against the temporary table.

Yes, that would cause a problem.

The SQL statement "DISCARD PLANS" should fix the problem.

Yours,
Laurenz Albe

--
Cybertec | https://www.cybertec-postgresql.com

#7Michael Lewis
mlewis@entrata.com
In reply to: Pavel Stehule (#6)
Re: Migrating an application with Oracle temporary tables

*I did find a scenario where this approach does run into trouble. That is,
if the function/procedure is executed against the permanent table and then
you go to run it against a temporary table. In that case, I do get the
wrong answer, and I haven't yet figured out how to reset that without
dropping the procedure and re-defining it. For my purposes, that is "good
enough" -- I can promise not to run such procedures against the temporary
table.*

You may want to be explicit and refer to the table with schema like *select
* from public.table_name *OR* select * from pg_temp.table_name *to avoid
confusion caused by the default behavior of using the temp table if it
exists.