Re: Implementation of global temporary tables?

Started by Zhaomo Yangover 10 years ago31 messages
#1Zhaomo Yang
zhy001@cs.ucsd.edu

more global temp tables are little bit comfortable for developers,

I'd like to emphasize this point. This feature does much more than saving a
developer from issuing a CREATE TEMP TABLE statement in every session. Here
are two common use cases and I'm sure there are more.

(1)
Imagine in a web application scenario, a developer wants to cache some
session information in a temp table. What's more, he also wants to specify
some rules which reference the session information. Without this feature,
the rules will be removed at the end of every session since they depend on
a temporary object. Global temp tables will allow the developer to define
the temp table and the rules once.

(2)
The second case is mentioned by Tom Lane back in 2010 in a thread about
global temp tables.
(/messages/by-id/9319.1272130283@sss.pgh.pa.us)
"The context that I've seen it come up in is that people don't want to
clutter their functions with
create-it-if-it-doesn't-exist logic, which you have to have given the
current behavior of temp tables."

2.a - using on demand created temp tables - most simple solution, but
doesn't help with catalogue bloating

I've read the thread and people disapprove this approach because of the
potential catalog bloat. However, I'd like to champion it. Indeed, this
approach may have a bloat issue. But for users who needs global temp
tables, they now have to create a new temp table in every session, which
means they already have the bloat problem and presumably they have some
policy to deal with it. In other words, implementing global temp tables by
this approach gives users the same performance, plus the convenience the
feature brings.

The root problem here is that whether "whether having the unoptimized
feature is better than
having no feature at all". Actually, there was a very similar discussion
back in 2009 on global temp tables. Let me borrow Kevin Grittner's and Tom
Lane's arguments here.

Kevin Grittner's argument:

/messages/by-id/49F82AEA.EE98.0025.0@wicourts.gov
"... If you're saying we can implement the standard's global temporary
tables in a way that performs better than current temporary tables, that's
cool. That would be a nice "bonus" in addition to the application
programmer convenience and having another tick-mark on the standards
compliance charts. Do you think that's feasible? If not, the feature
would be useful to some with the same performance that temporary tables
currently provide."

Tom Lane's arguments:

/messages/by-id/24110.1241035178@sss.pgh.pa.us
"I'm all for eliminating catalog overheads, if we can find a way to do
that. I don't think that you get to veto implementation of the feature
until we can find a way to optimize it better. The question is not about
whether having the optimization would be better than not having it --- it's
about whether having the unoptimized feature is better than having no
feature at all (which means people have to implement the same behavior by
hand, and they'll *still* not get the optimization)."

There have been several threads here discussing global temp table since
2007. Quite a few ideas aimed to avoid the bloat issue by not storing the
metadata of the session copy in the catalog. However, it seems that none of
them has been implemented, or even has a feasible design. So why don't we
implement it in a unoptimized way first?

Is there still interest about this feature?

I'm very interested in this feature. I'm thinking about one implementation
which is similar to Pavel's 2009 proposal (
/messages/by-id/162867790904271344s1ec96d90j6cde295fdcc7806f@mail.gmail.com).
Here are the major ideas of my design:

(1)
Creating the cross-session persistent schema as a regular table and
creating session-private temp tables when a session first accesses it.

(2)
For DML queries, The global temp table is overloaded by its session copy
after the relation is opened by an oid or a rangevar. For DDL queries,
which copy is used depends on whether the query needs to access the data or
metadata of the global temp table.

There are more differences between this design and Pavel's 2009 proposal
and I'd like to send a detailed proposal to the mailing list but first I
want to know if our community would accept a global temp table
implementation which provides the same performance as currently temp tables
do.

Thanks,
Zhaomo

#2Pavel Stehule
pavel.stehule@gmail.com
In reply to: Zhaomo Yang (#1)

Hi

2015-07-08 9:08 GMT+02:00 Zhaomo Yang <zhy001@cs.ucsd.edu>:

more global temp tables are little bit comfortable for developers,

I'd like to emphasize this point. This feature does much more than saving
a developer from issuing a CREATE TEMP TABLE statement in every session.
Here are two common use cases and I'm sure there are more.

(1)
Imagine in a web application scenario, a developer wants to cache some
session information in a temp table. What's more, he also wants to specify
some rules which reference the session information. Without this feature,
the rules will be removed at the end of every session since they depend on
a temporary object. Global temp tables will allow the developer to define
the temp table and the rules once.

(2)
The second case is mentioned by Tom Lane back in 2010 in a thread about
global temp tables.
(/messages/by-id/9319.1272130283@sss.pgh.pa.us)
"The context that I've seen it come up in is that people don't want to
clutter their functions with
create-it-if-it-doesn't-exist logic, which you have to have given the
current behavior of temp tables."

2.a - using on demand created temp tables - most simple solution, but
doesn't help with catalogue bloating

I've read the thread and people disapprove this approach because of the
potential catalog bloat. However, I'd like to champion it. Indeed, this
approach may have a bloat issue. But for users who needs global temp
tables, they now have to create a new temp table in every session, which
means they already have the bloat problem and presumably they have some
policy to deal with it. In other words, implementing global temp tables by
this approach gives users the same performance, plus the convenience the
feature brings.

The root problem here is that whether "whether having the unoptimized
feature is better than
having no feature at all". Actually, there was a very similar discussion
back in 2009 on global temp tables. Let me borrow Kevin Grittner's and Tom
Lane's arguments here.

Kevin Grittner's argument:

/messages/by-id/49F82AEA.EE98.0025.0@wicourts.gov
"... If you're saying we can implement the standard's global temporary
tables in a way that performs better than current temporary tables, that's
cool. That would be a nice "bonus" in addition to the application
programmer convenience and having another tick-mark on the standards
compliance charts. Do you think that's feasible? If not, the feature
would be useful to some with the same performance that temporary tables
currently provide."

Tom Lane's arguments:

/messages/by-id/24110.1241035178@sss.pgh.pa.us
"I'm all for eliminating catalog overheads, if we can find a way to do
that. I don't think that you get to veto implementation of the feature
until we can find a way to optimize it better. The question is not about
whether having the optimization would be better than not having it --- it's
about whether having the unoptimized feature is better than having no
feature at all (which means people have to implement the same behavior by
hand, and they'll *still* not get the optimization)."

There have been several threads here discussing global temp table since
2007. Quite a few ideas aimed to avoid the bloat issue by not storing the
metadata of the session copy in the catalog. However, it seems that none of
them has been implemented, or even has a feasible design. So why don't we
implement it in a unoptimized way first?

I am not sure, if it is not useless work.

Now, I am thinking so best implementation of global temp tables is
enhancing unlogged tables to have local content. All local data can be
saved in session memory. Usually it is less than 2KB with statistic, and
you don't need to store it in catalogue. When anybody is working with any
table, related data are copied to system cache - and there can be injected
a implementation of global temp tables.

regards

Pavel Stehule

Show quoted text

Is there still interest about this feature?

I'm very interested in this feature. I'm thinking about one implementation
which is similar to Pavel's 2009 proposal (
/messages/by-id/162867790904271344s1ec96d90j6cde295fdcc7806f@mail.gmail.com).
Here are the major ideas of my design:

(1)
Creating the cross-session persistent schema as a regular table and
creating session-private temp tables when a session first accesses it.

(2)
For DML queries, The global temp table is overloaded by its session copy
after the relation is opened by an oid or a rangevar. For DDL queries,
which copy is used depends on whether the query needs to access the data or
metadata of the global temp table.

There are more differences between this design and Pavel's 2009 proposal
and I'd like to send a detailed proposal to the mailing list but first I
want to know if our community would accept a global temp table
implementation which provides the same performance as currently temp tables
do.

Thanks,
Zhaomo

#3Zhaomo Yang
zhy001@cs.ucsd.edu
In reply to: Pavel Stehule (#2)

I am not sure, if it is not useless work.

I don't understand why an implementation taking approach 2.a would be
useless. As I said, its performance will be no worse than current temp
tables and it will provide a lot of convenience to users who need to create
temp tables in every session.

Thanks,
Zhaomo

On Tue, Jul 7, 2015 at 11:53 PM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:

Show quoted text

Hi

2015-07-08 9:08 GMT+02:00 Zhaomo Yang <zhy001@cs.ucsd.edu>:

more global temp tables are little bit comfortable for developers,

I'd like to emphasize this point. This feature does much more than saving
a developer from issuing a CREATE TEMP TABLE statement in every session.
Here are two common use cases and I'm sure there are more.

(1)
Imagine in a web application scenario, a developer wants to cache some
session information in a temp table. What's more, he also wants to specify
some rules which reference the session information. Without this feature,
the rules will be removed at the end of every session since they depend on
a temporary object. Global temp tables will allow the developer to define
the temp table and the rules once.

(2)
The second case is mentioned by Tom Lane back in 2010 in a thread about
global temp tables.
(/messages/by-id/9319.1272130283@sss.pgh.pa.us)
"The context that I've seen it come up in is that people don't want to
clutter their functions with
create-it-if-it-doesn't-exist logic, which you have to have given the
current behavior of temp tables."

2.a - using on demand created temp tables - most simple solution, but
doesn't help with catalogue bloating

I've read the thread and people disapprove this approach because of the
potential catalog bloat. However, I'd like to champion it. Indeed, this
approach may have a bloat issue. But for users who needs global temp
tables, they now have to create a new temp table in every session, which
means they already have the bloat problem and presumably they have some
policy to deal with it. In other words, implementing global temp tables by
this approach gives users the same performance, plus the convenience the
feature brings.

The root problem here is that whether "whether having the unoptimized
feature is better than
having no feature at all". Actually, there was a very similar discussion
back in 2009 on global temp tables. Let me borrow Kevin Grittner's and Tom
Lane's arguments here.

Kevin Grittner's argument:

/messages/by-id/49F82AEA.EE98.0025.0@wicourts.gov
"... If you're saying we can implement the standard's global temporary
tables in a way that performs better than current temporary tables, that's
cool. That would be a nice "bonus" in addition to the application
programmer convenience and having another tick-mark on the standards
compliance charts. Do you think that's feasible? If not, the feature
would be useful to some with the same performance that temporary tables
currently provide."

Tom Lane's arguments:

/messages/by-id/24110.1241035178@sss.pgh.pa.us
"I'm all for eliminating catalog overheads, if we can find a way to do
that. I don't think that you get to veto implementation of the feature
until we can find a way to optimize it better. The question is not about
whether having the optimization would be better than not having it --- it's
about whether having the unoptimized feature is better than having no
feature at all (which means people have to implement the same behavior by
hand, and they'll *still* not get the optimization)."

There have been several threads here discussing global temp table since
2007. Quite a few ideas aimed to avoid the bloat issue by not storing the
metadata of the session copy in the catalog. However, it seems that none of
them has been implemented, or even has a feasible design. So why don't we
implement it in a unoptimized way first?

I am not sure, if it is not useless work.

Now, I am thinking so best implementation of global temp tables is
enhancing unlogged tables to have local content. All local data can be
saved in session memory. Usually it is less than 2KB with statistic, and
you don't need to store it in catalogue. When anybody is working with any
table, related data are copied to system cache - and there can be injected
a implementation of global temp tables.

regards

Pavel Stehule

Is there still interest about this feature?

I'm very interested in this feature. I'm thinking about one
implementation which is similar to Pavel's 2009 proposal (
/messages/by-id/162867790904271344s1ec96d90j6cde295fdcc7806f@mail.gmail.com).
Here are the major ideas of my design:

(1)
Creating the cross-session persistent schema as a regular table and
creating session-private temp tables when a session first accesses it.

(2)
For DML queries, The global temp table is overloaded by its session copy
after the relation is opened by an oid or a rangevar. For DDL queries,
which copy is used depends on whether the query needs to access the data or
metadata of the global temp table.

There are more differences between this design and Pavel's 2009 proposal
and I'd like to send a detailed proposal to the mailing list but first I
want to know if our community would accept a global temp table
implementation which provides the same performance as currently temp tables
do.

Thanks,
Zhaomo

#4Pavel Stehule
pavel.stehule@gmail.com
In reply to: Zhaomo Yang (#3)

2015-07-09 7:32 GMT+02:00 Zhaomo Yang <zhy001@cs.ucsd.edu>:

I am not sure, if it is not useless work.

I don't understand why an implementation taking approach 2.a would be
useless. As I said, its performance will be no worse than current temp
tables and it will provide a lot of convenience to users who need to create
temp tables in every session.

Surely it should be step forward. But you will to have to solve lot of
problems with "duplicated" tables in system catalogue, and still it doesn't
solve the main problem with temporary tables - the bloating catalogue - and
related performance degradation.

Although global temp tables is nice to have feature (for PLpgSQL
developers), we can live without it - and with some patterns and
extensions, we are living well. But the performance issue is not be fixed
by any pattern. So the major motivation for introduction of global temp
tables is performance - from 90%. It should be a primary target to merge
this feature to upstream. I believe, when bloating will be solved, then the
chance to accept this patch will be pretty high.

Regards

Pavel

Show quoted text

Thanks,
Zhaomo

On Tue, Jul 7, 2015 at 11:53 PM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:

Hi

2015-07-08 9:08 GMT+02:00 Zhaomo Yang <zhy001@cs.ucsd.edu>:

more global temp tables are little bit comfortable for developers,

I'd like to emphasize this point. This feature does much more than
saving a developer from issuing a CREATE TEMP TABLE statement in every
session. Here are two common use cases and I'm sure there are more.

(1)
Imagine in a web application scenario, a developer wants to cache some
session information in a temp table. What's more, he also wants to specify
some rules which reference the session information. Without this feature,
the rules will be removed at the end of every session since they depend on
a temporary object. Global temp tables will allow the developer to define
the temp table and the rules once.

(2)
The second case is mentioned by Tom Lane back in 2010 in a thread about
global temp tables.
(/messages/by-id/9319.1272130283@sss.pgh.pa.us)
"The context that I've seen it come up in is that people don't want to
clutter their functions with
create-it-if-it-doesn't-exist logic, which you have to have given the
current behavior of temp tables."

2.a - using on demand created temp tables - most simple solution, but
doesn't help with catalogue bloating

I've read the thread and people disapprove this approach because of the
potential catalog bloat. However, I'd like to champion it. Indeed, this
approach may have a bloat issue. But for users who needs global temp
tables, they now have to create a new temp table in every session, which
means they already have the bloat problem and presumably they have some
policy to deal with it. In other words, implementing global temp tables by
this approach gives users the same performance, plus the convenience the
feature brings.

The root problem here is that whether "whether having the unoptimized
feature is better than
having no feature at all". Actually, there was a very similar discussion
back in 2009 on global temp tables. Let me borrow Kevin Grittner's and Tom
Lane's arguments here.

Kevin Grittner's argument:

/messages/by-id/49F82AEA.EE98.0025.0@wicourts.gov
"... If you're saying we can implement the standard's global temporary
tables in a way that performs better than current temporary tables, that's
cool. That would be a nice "bonus" in addition to the application
programmer convenience and having another tick-mark on the standards
compliance charts. Do you think that's feasible? If not, the feature
would be useful to some with the same performance that temporary tables
currently provide."

Tom Lane's arguments:

/messages/by-id/24110.1241035178@sss.pgh.pa.us
"I'm all for eliminating catalog overheads, if we can find a way to do
that. I don't think that you get to veto implementation of the feature
until we can find a way to optimize it better. The question is not about
whether having the optimization would be better than not having it --- it's
about whether having the unoptimized feature is better than having no
feature at all (which means people have to implement the same behavior by
hand, and they'll *still* not get the optimization)."

There have been several threads here discussing global temp table since
2007. Quite a few ideas aimed to avoid the bloat issue by not storing the
metadata of the session copy in the catalog. However, it seems that none of
them has been implemented, or even has a feasible design. So why don't we
implement it in a unoptimized way first?

I am not sure, if it is not useless work.

Now, I am thinking so best implementation of global temp tables is
enhancing unlogged tables to have local content. All local data can be
saved in session memory. Usually it is less than 2KB with statistic, and
you don't need to store it in catalogue. When anybody is working with any
table, related data are copied to system cache - and there can be injected
a implementation of global temp tables.

regards

Pavel Stehule

Is there still interest about this feature?

I'm very interested in this feature. I'm thinking about one
implementation which is similar to Pavel's 2009 proposal (
/messages/by-id/162867790904271344s1ec96d90j6cde295fdcc7806f@mail.gmail.com).
Here are the major ideas of my design:

(1)
Creating the cross-session persistent schema as a regular table and
creating session-private temp tables when a session first accesses it.

(2)
For DML queries, The global temp table is overloaded by its session copy
after the relation is opened by an oid or a rangevar. For DDL queries,
which copy is used depends on whether the query needs to access the data or
metadata of the global temp table.

There are more differences between this design and Pavel's 2009 proposal
and I'd like to send a detailed proposal to the mailing list but first I
want to know if our community would accept a global temp table
implementation which provides the same performance as currently temp tables
do.

Thanks,
Zhaomo

#5Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Pavel Stehule (#4)

On 7/9/15 12:45 AM, Pavel Stehule wrote:

2015-07-09 7:32 GMT+02:00 Zhaomo Yang <zhy001@cs.ucsd.edu
<mailto:zhy001@cs.ucsd.edu>>:

I am not sure, if it is not useless work.

I don't understand why an implementation taking approach 2.a would
be useless. As I said, its performance will be no worse than current
temp tables and it will provide a lot of convenience to users who
need to create temp tables in every session.

Surely it should be step forward. But you will to have to solve lot of
problems with "duplicated" tables in system catalogue, and still it
doesn't solve the main problem with temporary tables - the bloating
catalogue - and related performance degradation.

That being the "main" problem is strictly a matter of opinion based on
your experience. Many people don't have a performance problem today, but
do have to deal with all the pain of handling this manually (as well as
all the limitations that go with that).

If it's easy to fix the bloat problem at the same time as adding GLOBAL
TEMP then great! But there's no reason to reject this just because it
doesn't fix that issue.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Simon Riggs
simon@2ndQuadrant.com
In reply to: Jim Nasby (#5)

On 14 July 2015 at 23:20, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:

On 7/9/15 12:45 AM, Pavel Stehule wrote:

2015-07-09 7:32 GMT+02:00 Zhaomo Yang <zhy001@cs.ucsd.edu
<mailto:zhy001@cs.ucsd.edu>>:

I am not sure, if it is not useless work.

I don't understand why an implementation taking approach 2.a would
be useless. As I said, its performance will be no worse than current
temp tables and it will provide a lot of convenience to users who
need to create temp tables in every session.

Surely it should be step forward. But you will to have to solve lot of
problems with "duplicated" tables in system catalogue, and still it
doesn't solve the main problem with temporary tables - the bloating
catalogue - and related performance degradation.

That being the "main" problem is strictly a matter of opinion based on
your experience. Many people don't have a performance problem today, but do
have to deal with all the pain of handling this manually (as well as all
the limitations that go with that).

If it's easy to fix the bloat problem at the same time as adding GLOBAL
TEMP then great! But there's no reason to reject this just because it
doesn't fix that issue.

Agreed

There are some good arguments for why we need this feature.

Pavel's original description of how to do this seem valid, and from the
link Tom agreed in 2009.

For me the design summary is this

* CREATE GLOBAL TEMP TABLE creates catalog entries like a normal table but
with different relkind
* When we see a request to INSERT, DEL, UPD, SEL from the temp table, if it
does not exist we create it as a TEMP table of the same name, using the
Global's pg_class entry as a template

That meets the SQL Standard and doesn't contain any visibility problems or
need for new internals.

The purpose of this feature is to automatically create a temp table with
the same definition whenever needed. The discussion of "bloat" is just
wrong. We create exactly the same amount of bloat as if we had typed CREATE
TEMP TABLE. Optimising temp table entries in the catalog is another,
separate patch, if we care.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/&gt;
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#7Andrew Dunstan
andrew@dunslane.net
In reply to: Simon Riggs (#6)

On 07/15/2015 07:58 AM, Simon Riggs wrote:

For me the design summary is this

* CREATE GLOBAL TEMP TABLE creates catalog entries like a normal table
but with different relkind
* When we see a request to INSERT, DEL, UPD, SEL from the temp table,
if it does not exist we create it as a TEMP table of the same name,
using the Global's pg_class entry as a template

That meets the SQL Standard and doesn't contain any visibility
problems or need for new internals.

The purpose of this feature is to automatically create a temp table
with the same definition whenever needed. The discussion of "bloat" is
just wrong. We create exactly the same amount of bloat as if we had
typed CREATE TEMP TABLE. Optimising temp table entries in the catalog
is another, separate patch, if we care.

Sounds fine in general. I'm a bit curious to know what are the locking
implications of vivifying the table on access.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Pavel Stehule
pavel.stehule@gmail.com
In reply to: Simon Riggs (#6)

2015-07-15 13:58 GMT+02:00 Simon Riggs <simon@2ndquadrant.com>:

On 14 July 2015 at 23:20, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:

On 7/9/15 12:45 AM, Pavel Stehule wrote:

2015-07-09 7:32 GMT+02:00 Zhaomo Yang <zhy001@cs.ucsd.edu
<mailto:zhy001@cs.ucsd.edu>>:

I am not sure, if it is not useless work.

I don't understand why an implementation taking approach 2.a would
be useless. As I said, its performance will be no worse than current
temp tables and it will provide a lot of convenience to users who
need to create temp tables in every session.

Surely it should be step forward. But you will to have to solve lot of
problems with "duplicated" tables in system catalogue, and still it
doesn't solve the main problem with temporary tables - the bloating
catalogue - and related performance degradation.

That being the "main" problem is strictly a matter of opinion based on
your experience. Many people don't have a performance problem today, but do
have to deal with all the pain of handling this manually (as well as all
the limitations that go with that).

If it's easy to fix the bloat problem at the same time as adding GLOBAL
TEMP then great! But there's no reason to reject this just because it
doesn't fix that issue.

Agreed

There are some good arguments for why we need this feature.

Pavel's original description of how to do this seem valid, and from the
link Tom agreed in 2009.

For me the design summary is this

* CREATE GLOBAL TEMP TABLE creates catalog entries like a normal table but
with different relkind
* When we see a request to INSERT, DEL, UPD, SEL from the temp table, if
it does not exist we create it as a TEMP table of the same name, using the
Global's pg_class entry as a template

That meets the SQL Standard and doesn't contain any visibility problems or
need for new internals.

The purpose of this feature is to automatically create a temp table with
the same definition whenever needed. The discussion of "bloat" is just
wrong. We create exactly the same amount of bloat as if we had typed CREATE
TEMP TABLE. Optimising temp table entries in the catalog is another,
separate patch, if we care.

The optimization of local temp tables is little bit harder - you cannot to
share pg_class and pg_attribute - although some memory entries can be used
too.

Regards

Pavel

Show quoted text

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/&gt;
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#9Zhaomo Yang
zhy001@cs.ucsd.edu
In reply to: Andrew Dunstan (#7)

Sounds fine in general. I'm a bit curious to know what are the locking

implications of > vivifying the table on access.

The locking implications depend on how we interpret the existing commands
in the context of global temp tables and I think we should discuss and
agree on the behaviors of the commands with global temp tables, but I think
in general we can follow these rules:

If the command executes on the global temp table's metadata, for example an
ALTER TABLE command, then we lock the global copy at the same level as we
do a regular table.

If the command executes on the global temp table's data (which is actually
stored in the session copy), for example an DML command, then the global
copy is locked at the AccessShareLock level to prevent concurrent
modifications to the global temp table's definition from other sessions.

Thanks,
Zhaomo

On Wed, Jul 15, 2015 at 4:26 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

Show quoted text

On 07/15/2015 07:58 AM, Simon Riggs wrote:

For me the design summary is this

* CREATE GLOBAL TEMP TABLE creates catalog entries like a normal table
but with different relkind
* When we see a request to INSERT, DEL, UPD, SEL from the temp table, if
it does not exist we create it as a TEMP table of the same name, using the
Global's pg_class entry as a template

That meets the SQL Standard and doesn't contain any visibility problems
or need for new internals.

The purpose of this feature is to automatically create a temp table with
the same definition whenever needed. The discussion of "bloat" is just
wrong. We create exactly the same amount of bloat as if we had typed CREATE
TEMP TABLE. Optimising temp table entries in the catalog is another,
separate patch, if we care.

Sounds fine in general. I'm a bit curious to know what are the locking
implications of vivifying the table on access.

cheers

andrew

#10Pavel Stehule
pavel.stehule@gmail.com
In reply to: Zhaomo Yang (#9)

2015-07-15 15:21 GMT+02:00 Zhaomo Yang <zhy001@cs.ucsd.edu>:

Sounds fine in general. I'm a bit curious to know what are the locking

implications of > vivifying the table on access.

The locking implications depend on how we interpret the existing commands
in the context of global temp tables and I think we should discuss and
agree on the behaviors of the commands with global temp tables, but I think
in general we can follow these rules:

If the command executes on the global temp table's metadata, for example
an ALTER TABLE command, then we lock the global copy at the same level as
we do a regular table.

there is other question - what is effect of ALTER TABLE of global temp
table on instances of this table in active sessions?

Show quoted text

If the command executes on the global temp table's data (which is actually
stored in the session copy), for example an DML command, then the global
copy is locked at the AccessShareLock level to prevent concurrent
modifications to the global temp table's definition from other sessions.

Thanks,
Zhaomo

On Wed, Jul 15, 2015 at 4:26 AM, Andrew Dunstan <andrew@dunslane.net>
wrote:

On 07/15/2015 07:58 AM, Simon Riggs wrote:

For me the design summary is this

* CREATE GLOBAL TEMP TABLE creates catalog entries like a normal table
but with different relkind
* When we see a request to INSERT, DEL, UPD, SEL from the temp table, if
it does not exist we create it as a TEMP table of the same name, using the
Global's pg_class entry as a template

That meets the SQL Standard and doesn't contain any visibility problems
or need for new internals.

The purpose of this feature is to automatically create a temp table with
the same definition whenever needed. The discussion of "bloat" is just
wrong. We create exactly the same amount of bloat as if we had typed CREATE
TEMP TABLE. Optimising temp table entries in the catalog is another,
separate patch, if we care.

Sounds fine in general. I'm a bit curious to know what are the locking
implications of vivifying the table on access.

cheers

andrew

#11Zhaomo Yang
zhy001@cs.ucsd.edu
In reply to: Pavel Stehule (#10)

there is other question - what is effect of ALTER TABLE of global temp

table on

instances of this table in active sessions?

As I said, we need to first agree on the behaviors of the existing
commands. I can think of two options now for ALTER TABLE: 1) only allow
ALTER TABLE when there is no other active sessions (this is how Oracle
deals with it.) 2) handle it as if session copies inherit from the global
copy and ALTER TABLE executes on the global copy.

Thanks,
Zhaomo

#12Simon Riggs
simon@2ndQuadrant.com
In reply to: Andrew Dunstan (#7)

On 15 July 2015 at 13:26, Andrew Dunstan <andrew@dunslane.net> wrote:

On 07/15/2015 07:58 AM, Simon Riggs wrote:

For me the design summary is this

* CREATE GLOBAL TEMP TABLE creates catalog entries like a normal table
but with different relkind
* When we see a request to INSERT, DEL, UPD, SEL from the temp table, if
it does not exist we create it as a TEMP table of the same name, using the
Global's pg_class entry as a template

That meets the SQL Standard and doesn't contain any visibility problems
or need for new internals.

The purpose of this feature is to automatically create a temp table with
the same definition whenever needed. The discussion of "bloat" is just
wrong. We create exactly the same amount of bloat as if we had typed CREATE
TEMP TABLE. Optimising temp table entries in the catalog is another,
separate patch, if we care.

Sounds fine in general. I'm a bit curious to know what are the locking
implications of vivifying the table on access.

We would lock the Global Temp Table at the same lock level for the
activity, just as we do for INSERT, SELECT etc.. That prevents concurrent
DDL like DROP or ALTER on the Global Temp Table.

The Local temp table created is a copy of the Global Temp Table. The Local
temp table is only locally locked, so no worries.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/&gt;
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#13Pavel Stehule
pavel.stehule@gmail.com
In reply to: Zhaomo Yang (#11)

2015-07-15 15:53 GMT+02:00 Zhaomo Yang <zhy001@cs.ucsd.edu>:

there is other question - what is effect of ALTER TABLE of global temp

table on

instances of this table in active sessions?

As I said, we need to first agree on the behaviors of the existing
commands. I can think of two options now for ALTER TABLE: 1) only allow
ALTER TABLE when there is no other active sessions (this is how Oracle
deals with it.) 2) handle it as if session copies inherit from the global
copy and ALTER TABLE executes on the global copy.

There are two possible kinds of GLOBAL TEMP tables - session related and
transation related. Transaction related tables has very short life - and @1
needs outage, @2 requires stronger locks and can slow and less effective -
because a) some changes can be invisible in other transactions (depends on
isolation levels), b) the structure can be changed, but function code not
(without dependency on isolation levels) - so it can be non consistent, c)
why to change table if this table will be dropped in next milisecond. For
this case the behave like PL functions can be very practical ~ third option
for ALTER TABLE

Regards

Pavel

Show quoted text

Thanks,
Zhaomo

#14Andres Freund
andres@anarazel.de
In reply to: Simon Riggs (#6)

On 2015-07-15 12:58:51 +0100, Simon Riggs wrote:

On 14 July 2015 at 23:20, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
Pavel's original description of how to do this seem valid, and from the
link Tom agreed in 2009.

For me the design summary is this

* CREATE GLOBAL TEMP TABLE creates catalog entries like a normal table but
with different relkind
* When we see a request to INSERT, DEL, UPD, SEL from the temp table, if it
does not exist we create it as a TEMP table of the same name, using the
Global's pg_class entry as a template

Why do we need to create that copy? We can just use the relfilenode in
all backends by having the backendid in the filename? Yes, there's a
some amount of additional code needed, but it's not that much? I
actually think it might end up being less additional code than having a
copy, because with the copy you'll have two different oids for global
entry and the local copy.

Regards,

Andres

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#14)

On 2015-07-15 16:52:49 +0200, Andres Freund wrote:

Why do we need to create that copy? We can just use the relfilenode in
all backends by having the backendid in the filename? Yes, there's a
some amount of additional code needed, but it's not that much? I
actually think it might end up being less additional code than having a
copy, because with the copy you'll have two different oids for global
entry and the local copy.

Hm, yes. Brainfart. Transaction table rewrites/truncations need to
change the relfilenode.

To fix We could add a backend local mapping table from global temp table
id to the backend local relfilenode. The code to lookup the relfilenode
is already mostly isolated.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Simon Riggs
simon@2ndQuadrant.com
In reply to: Andres Freund (#15)

On 15 July 2015 at 15:57, Andres Freund <andres@anarazel.de> wrote:

On 2015-07-15 16:52:49 +0200, Andres Freund wrote:

Why do we need to create that copy? We can just use the relfilenode in
all backends by having the backendid in the filename? Yes, there's a
some amount of additional code needed, but it's not that much? I
actually think it might end up being less additional code than having a
copy, because with the copy you'll have two different oids for global
entry and the local copy.

Hm, yes. Brainfart. Transaction table rewrites/truncations need to
change the relfilenode.

To fix We could add a backend local mapping table from global temp table
id to the backend local relfilenode. The code to lookup the relfilenode
is already mostly isolated.

It may be possible to do this, though I'm sure there's a wrinkle somewhere.
But there doesn't seem to be a need to overload the main feature request
with additional requirements. Doing that is just scope creep that prevents
us getting features out. Nice, simple patches from newer developers. Later
tuning and tweaking from more expert community members.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/&gt;
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#17Andres Freund
andres@anarazel.de
In reply to: Simon Riggs (#16)

On 2015-07-15 16:24:52 +0100, Simon Riggs wrote:

It may be possible to do this, though I'm sure there's a wrinkle somewhere.
But there doesn't seem to be a need to overload the main feature request
with additional requirements. Doing that is just scope creep that prevents
us getting features out. Nice, simple patches from newer developers. Later
tuning and tweaking from more expert community members.

I think that's generally a fair point. But here we're discussing to add
a fair amount of wrinkles with the copy approach. The fact alone that
the oid is different will have some ugly consequences.

So we add complexity, just to shift it into different places later? I'm
not sure that's a good idea.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18Simon Riggs
simon@2ndQuadrant.com
In reply to: Andres Freund (#17)

On 15 July 2015 at 16:28, Andres Freund <andres@anarazel.de> wrote:

On 2015-07-15 16:24:52 +0100, Simon Riggs wrote:

It may be possible to do this, though I'm sure there's a wrinkle

somewhere.

But there doesn't seem to be a need to overload the main feature request
with additional requirements. Doing that is just scope creep that

prevents

us getting features out. Nice, simple patches from newer developers.

Later

tuning and tweaking from more expert community members.

I think that's generally a fair point. But here we're discussing to add
a fair amount of wrinkles with the copy approach. The fact alone that
the oid is different will have some ugly consequences.

Why? We are creating a local temp table LIKE the global temp table. That is
already a supported operation. So there is no "different oid".

So we add complexity, just to shift it into different places later? I'm
not sure that's a good idea.

There's no complexity in a simple temp table like. We can do this now with
triggers.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/&gt;
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#17)

Andres Freund <andres@anarazel.de> writes:

On 2015-07-15 16:24:52 +0100, Simon Riggs wrote:

It may be possible to do this, though I'm sure there's a wrinkle somewhere.
But there doesn't seem to be a need to overload the main feature request
with additional requirements. Doing that is just scope creep that prevents
us getting features out. Nice, simple patches from newer developers. Later
tuning and tweaking from more expert community members.

I think that's generally a fair point. But here we're discussing to add
a fair amount of wrinkles with the copy approach. The fact alone that
the oid is different will have some ugly consequences.

So we add complexity, just to shift it into different places later? I'm
not sure that's a good idea.

With all due respect, there are features that are beyond the abilities of
some "newer developers", and reducing the scope isn't a good way to fix
that. It just leaves a bigger mess to be cleaned up later.

I think Andres' idea of a per-backend filenode mapping table might work.
The existing relfilenode mapper solves a somewhat related problem, namely
how do you replace the filenode for shared system catalogs whose pg_class
entries can't be changed.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#18)

Simon Riggs <simon@2ndQuadrant.com> writes:

On 15 July 2015 at 16:28, Andres Freund <andres@anarazel.de> wrote:

I think that's generally a fair point. But here we're discussing to add
a fair amount of wrinkles with the copy approach. The fact alone that
the oid is different will have some ugly consequences.

Why? We are creating a local temp table LIKE the global temp table. That is
already a supported operation. So there is no "different oid".

You're presuming a specific implementation decision, one that has not been
made yet, and isn't all that attractive because of the catalog bloat issues.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21Andres Freund
andres@anarazel.de
In reply to: Simon Riggs (#18)

On 2015-07-15 16:36:12 +0100, Simon Riggs wrote:

On 15 July 2015 at 16:28, Andres Freund <andres@anarazel.de> wrote:

I think that's generally a fair point. But here we're discussing to add
a fair amount of wrinkles with the copy approach. The fact alone that
the oid is different will have some ugly consequences.

Why? We are creating a local temp table LIKE the global temp table. That is
already a supported operation. So there is no "different oid".

Then your locking against ALTER, DROP etc. isn't going to work.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#22Simon Riggs
simon@2ndQuadrant.com
In reply to: Andres Freund (#21)

On 15 July 2015 at 16:44, Andres Freund <andres@anarazel.de> wrote:

On 2015-07-15 16:36:12 +0100, Simon Riggs wrote:

On 15 July 2015 at 16:28, Andres Freund <andres@anarazel.de> wrote:

I think that's generally a fair point. But here we're discussing to add
a fair amount of wrinkles with the copy approach. The fact alone that
the oid is different will have some ugly consequences.

Why? We are creating a local temp table LIKE the global temp table. That

is

already a supported operation. So there is no "different oid".

Then your locking against ALTER, DROP etc. isn't going to work.

There would be two objects, both locked. The temp table is just nice and
simple. No problem.

Your optimization may work; I hope it does. My approach definitely will. So
we could choose either.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/&gt;
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#23Josh Berkus
josh@agliodbs.com
In reply to: Zhaomo Yang (#1)

Pavel, All:

Just to be clear, the idea of a global temp table is that the table def
is available to all users, but the data is private to each session?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#24Zhaomo Yang
zhy001@cs.ucsd.edu
In reply to: Josh Berkus (#23)

Just to be clear, the idea of a global temp table is that the table def
is available to all users, but the data is private to each session?

The table def is visible to all sessions and persistent, but the data is
private to each session and temporary.

Thanks,
Zhaomo

#25Pavel Stehule
pavel.stehule@gmail.com
In reply to: Josh Berkus (#23)

2015-07-19 21:39 GMT+02:00 Josh Berkus <josh@agliodbs.com>:

Pavel, All:

Just to be clear, the idea of a global temp table is that the table def
is available to all users, but the data is private to each session?

yes.

Pavel

Show quoted text

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#26Gavin Flower
GavinFlower@archidevsys.co.nz
In reply to: Pavel Stehule (#25)

On 20/07/15 15:00, Pavel Stehule wrote:

2015-07-19 21:39 GMT+02:00 Josh Berkus <josh@agliodbs.com
<mailto:josh@agliodbs.com>>:

Pavel, All:

Just to be clear, the idea of a global temp table is that the
table def
is available to all users, but the data is private to each session?

yes.

Pavel

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

Just wondering...

Would it be difficult to add the ability for one user to share the
contents with a list of named other users (roles)?

-Gavin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#27Pavel Stehule
pavel.stehule@gmail.com
In reply to: Gavin Flower (#26)

2015-07-20 5:33 GMT+02:00 Gavin Flower <GavinFlower@archidevsys.co.nz>:

On 20/07/15 15:00, Pavel Stehule wrote:

2015-07-19 21:39 GMT+02:00 Josh Berkus <josh@agliodbs.com <mailto:
josh@agliodbs.com>>:

Pavel, All:

Just to be clear, the idea of a global temp table is that the
table def
is available to all users, but the data is private to each session?

yes.

Pavel

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

Just wondering...

Would it be difficult to add the ability for one user to share the
contents with a list of named other users (roles)?

Probably it is possible, but not for temporary data - short data are in
process memory, so it are not accessible from other sessions.

This sharing tables needs:

1. some infrastructure to hold data about sharing - who can share with what
2. who will clean data? temporary data are cleaned on end of transaction or
end of session
3. data should be saved in shared memory instead process memory

So it is possible, but partially different

Show quoted text

-Gavin

#28Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Pavel Stehule (#27)

Pavel Stehule wrote:

2015-07-20 5:33 GMT+02:00 Gavin Flower <GavinFlower@archidevsys.co.nz>:

Would it be difficult to add the ability for one user to share the
contents with a list of named other users (roles)?

Probably it is possible, but not for temporary data - short data are in
process memory, so it are not accessible from other sessions.

This sharing tables needs:

1. some infrastructure to hold data about sharing - who can share with what
2. who will clean data? temporary data are cleaned on end of transaction or
end of session
3. data should be saved in shared memory instead process memory

So it is possible, but partially different

To me this gets in the "crazy ideas" list. Please add it to the TODO
page in the wiki, so that we're sure we never implement it.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#29Andres Freund
andres@anarazel.de
In reply to: Gavin Flower (#26)

On 2015-07-20 15:33:32 +1200, Gavin Flower wrote:

Would it be difficult to add the ability for one user to share the contents
with a list of named other users (roles)?

No need. That feature is called unlogged tables and grants.

Doing this for temporary tables would be horrible. They live in process
local memory and not shared memory. Because that provides higher
isolation, not even though it does.

Andres

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#30Pavel Stehule
pavel.stehule@gmail.com
In reply to: Alvaro Herrera (#28)

2015-07-20 11:07 GMT+02:00 Alvaro Herrera <alvherre@2ndquadrant.com>:

Pavel Stehule wrote:

2015-07-20 5:33 GMT+02:00 Gavin Flower <GavinFlower@archidevsys.co.nz>:

Would it be difficult to add the ability for one user to share the
contents with a list of named other users (roles)?

Probably it is possible, but not for temporary data - short data are in
process memory, so it are not accessible from other sessions.

This sharing tables needs:

1. some infrastructure to hold data about sharing - who can share with

what

2. who will clean data? temporary data are cleaned on end of transaction

or

end of session
3. data should be saved in shared memory instead process memory

So it is possible, but partially different

To me this gets in the "crazy ideas" list. Please add it to the TODO
page in the wiki, so that we're sure we never implement it.

yes, it is pretty crazy - Have no plan to implement it :)

Pavel

Show quoted text

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#31Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#22)

On Wed, Jul 15, 2015 at 11:52 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

On 15 July 2015 at 16:44, Andres Freund <andres@anarazel.de> wrote:

On 2015-07-15 16:36:12 +0100, Simon Riggs wrote:

On 15 July 2015 at 16:28, Andres Freund <andres@anarazel.de> wrote:

I think that's generally a fair point. But here we're discussing to
add
a fair amount of wrinkles with the copy approach. The fact alone that
the oid is different will have some ugly consequences.

Why? We are creating a local temp table LIKE the global temp table. That
is
already a supported operation. So there is no "different oid".

Then your locking against ALTER, DROP etc. isn't going to work.

There would be two objects, both locked. The temp table is just nice and
simple. No problem.

Your optimization may work; I hope it does. My approach definitely will. So
we could choose either.

It's not really an optimization; it's a whole different approach. I
looked at the create-a-temp-table-on-the-fly idea back when I
implemented unlogged tables and concluded it was an unworkable mess.
Deep down in the guts of name resolution code is not the place where
you want to suddenly decide that you need to run some DDL. So I
believe in what Andres is proposing. I'm not necessarily going to
shout it down if somebody finds a way to make the
temp-table-on-the-fly approach work, but my view is that making that
work, although it may look superficially appealing, will eventually
make whoever has to do it hate their life; and that even if they get
it to where it sorta works, it's going to have ugly corner cases that
are almost impossible to file down.

Another advantage of Andres's approach, BTW, is that it could
potentially eventually be extended to work on Hot Standby machines.
For that to work, we'd need a separate XID space for temporary tables,
but Noah proposed that before, and I don't think it's a completely
crazy idea (just mostly crazy). Now, maybe nobody's going to care
about that any more in 5 years if we have full-blown logical
replication deeply integrated into core, but there's a lot to like
about a design that keeps our options in that area open.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers