including backend ID in relpath of temp rels - updated patch

Started by Robert Haasalmost 16 years ago47 messageshackers
Jump to latest
#1Robert Haas
robertmhaas@gmail.com

For previous discussion of this topic, see:

http://archives.postgresql.org/pgsql-hackers/2010-04/msg01181.php
http://archives.postgresql.org/pgsql-hackers/2010-05/msg00352.php
http://archives.postgresql.org/pgsql-hackers/2010-06/msg00302.php

As in the original version of the patch, I have not simply added
backend ID to RelFileNode, because there are too many places using
RelFileNode in contexts where the backend ID can be determined from
context, such as the shared and local buffer managers and the xlog
code. Instead, I have introduced BackendRelFileNode for contexts
where we need both the RelFileNode and the backend ID. The smgr layer
has to use BackendRelFileNode across the board, since it operates on
both permanent and temporary relation, including - potentially -
temporary relations of other backends. smgr invalidations must also
include the backend ID, as must communication between regular backends
and the bgwriter. The relcache now stores rd_backend instead of
rd_islocaltemp so that it remains straightforward to call smgropen()
based on a relcache entry. Some smgr functions no longer require an
isTemp argument, because they can infer the necessary information from
their BackendRelFileNode. smgrwrite() and smgrextend() now take a
skipFsync argument rather than an isTemp argument.

In this version of the patch, I've improved the temporary file cleanup
mechanism. In CVS HEAD, when a transaction commits or aborts, we
write an XLOG record with all relations that must be unlinked,
including temporary relations. With this patch, we no longer include
temporary relations in the XLOG records (which may be a tiny
performance benefit for some people); instead, on every startup of the
database cluster, we just nuke all temporary relation files (which is
now easy to do, since they are named differently than files for
non-temporary relations) at the same time that we nuke other temp
files. This produces slightly different behavior. In CVS HEAD,
temporary files get removed whenever an xlog redo happens, so either
at cluster start or after a backend crash, but only to the extent that
they appear in WAL. With this patch, we can be sure of removing ALL
stray files, which is maybe ever-so-slightly leaky in CVS HEAD. But
on the other hand, because it hooks into the existing temporary file
cleanup code, it only happens at cluster startup, NOT after a backend
crash. The existing coding leaves temporary sort files and similar
around after a backend crash for forensics purposes. That might or
might not be worth rethinking for non-debug builds, but I don't think
there's any very good reason to think that temporary relation files
need to be handled differently than temporary work files.

I believe that this patch will clear away one major obstacle to
implementing global temporary and unlogged tables: it enables us to be
sure of cleaning up properly after a crash without relying on catalog
entries or XLOG. Based on previous discussions, however, I believe
there is support for making this change independently of what becomes
of that project, just for the benefit of having a more robust cleanup
mechanism.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Attachments:

temprelnames-v2.patchapplication/octet-stream; name=temprelnames-v2.patchDownload+556-264
#2Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#1)
Re: including backend ID in relpath of temp rels - updated patch

On Thu, Jun 10, 2010 at 4:39 PM, Robert Haas <robertmhaas@gmail.com> wrote:

For previous discussion of this topic, see:

http://archives.postgresql.org/pgsql-hackers/2010-04/msg01181.php
http://archives.postgresql.org/pgsql-hackers/2010-05/msg00352.php
http://archives.postgresql.org/pgsql-hackers/2010-06/msg00302.php

As in the original version of the patch, I have not simply added
backend ID to RelFileNode, because there are too many places using
RelFileNode in contexts where the backend ID can be determined from
context, such as the shared and local buffer managers and the xlog
code.  Instead, I have introduced BackendRelFileNode for contexts
where we need both the RelFileNode and the backend ID.  The smgr layer
has to use BackendRelFileNode across the board, since it operates on
both permanent and temporary relation, including - potentially -
temporary relations of other backends.  smgr invalidations must also
include the backend ID, as must communication between regular backends
and the bgwriter.  The relcache now stores rd_backend instead of
rd_islocaltemp so that  it remains straightforward to call smgropen()
based on a relcache entry. Some smgr functions no longer require an
isTemp argument, because they can infer the necessary information from
their BackendRelFileNode.  smgrwrite() and smgrextend() now take a
skipFsync argument rather than an isTemp argument.

In this version of the patch, I've improved the temporary file cleanup
mechanism.  In CVS HEAD, when a transaction commits or aborts, we
write an XLOG record with all relations that must be unlinked,
including temporary relations.  With this patch, we no longer include
temporary relations in the XLOG records (which may be a tiny
performance benefit for some people); instead, on every startup of the
database cluster, we just nuke all temporary relation files (which is
now easy to do, since they are named differently than files for
non-temporary relations) at the same time that we nuke other temp
files.  This produces slightly different behavior.  In CVS HEAD,
temporary files get removed whenever an xlog redo happens, so either
at cluster start or after a backend crash, but only to the extent that
they appear in WAL.  With this patch, we can be sure of removing ALL
stray files, which is maybe ever-so-slightly leaky in CVS HEAD.  But
on the other hand, because it hooks into the existing temporary file
cleanup code, it only happens at cluster startup, NOT after a backend
crash.  The existing coding leaves temporary sort files and similar
around after a backend crash for forensics purposes.  That might or
might not be worth rethinking for non-debug builds, but I don't think
there's any very good reason to think that temporary relation files
need to be handled differently than temporary work files.

I believe that this patch will clear away one major obstacle to
implementing global temporary and unlogged tables: it enables us to be
sure of cleaning up properly after a crash without relying on catalog
entries or XLOG.  Based on previous discussions, however, I believe
there is support for making this change independently of what becomes
of that project, just for the benefit of having a more robust cleanup
mechanism.

Updated patch to remove minor bitrot.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Attachments:

temprelnames-v3.patchapplication/octet-stream; name=temprelnames-v3.patchDownload+556-264
#3Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Robert Haas (#1)
Re: including backend ID in relpath of temp rels - updated patch

On Thu, Jun 10, 2010 at 3:39 PM, Robert Haas <robertmhaas@gmail.com> wrote:

I believe that this patch will clear away one major obstacle to
implementing global temporary and unlogged tables: it enables us to be
sure of cleaning up properly after a crash without relying on catalog
entries or XLOG.  Based on previous discussions, however, I believe
there is support for making this change independently of what becomes
of that project, just for the benefit of having a more robust cleanup
mechanism.

Hi,

i was looking at v3 of this patch...

it looks good, works as advertised, pass regression tests, and all
tests i could think of...
haven't looked the code at much detail but seems ok, to me at least...

--
Jaime Casanova         www.2ndQuadrant.com
Soporte y capacitación de PostgreSQL

#4Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Jaime Casanova (#3)
Re: including backend ID in relpath of temp rels - updated patch

On Fri, Jul 23, 2010 at 10:05 AM, Jaime Casanova <jaime@2ndquadrant.com> wrote:

On Thu, Jun 10, 2010 at 3:39 PM, Robert Haas <robertmhaas@gmail.com> wrote:

I believe that this patch will clear away one major obstacle to
implementing global temporary and unlogged tables: it enables us to be
sure of cleaning up properly after a crash without relying on catalog
entries or XLOG.  Based on previous discussions, however, I believe
there is support for making this change independently of what becomes
of that project, just for the benefit of having a more robust cleanup
mechanism.

Hi,

i was looking at v3 of this patch...

Ok, i like what you did in smgrextend, smgrwrite, and others...
changing isTemp for skipFsync is more descriptive

but i have a few questions, maybe is right what you did i only want to
understand it:
- you added this in include/storage/smgr.h, so why is safe to assume
that if the backend != InvalidBackendId it must be a temp relation?

+#define SmgrIsTemp(smgr) \
+   ((smgr)->smgr_rnode.backend != InvalidBackendId)

- you added a question like this "if (rel->rd_backend == MyBackendId)"
in a few places... why are you assuming that? that couldn't be a new
created relation (in current session of course)? is that safe?

--
Jaime Casanova         www.2ndQuadrant.com
Soporte y capacitación de PostgreSQL

#5Robert Haas
robertmhaas@gmail.com
In reply to: Jaime Casanova (#4)
Re: including backend ID in relpath of temp rels - updated patch

On Sun, Jul 25, 2010 at 2:37 AM, Jaime Casanova <jaime@2ndquadrant.com> wrote:

but i have a few questions, maybe is right what you did i only want to
understand it:
- you added this in include/storage/smgr.h, so why is safe to assume
that if the backend != InvalidBackendId it must be a temp relation?

+#define SmgrIsTemp(smgr) \
+   ((smgr)->smgr_rnode.backend != InvalidBackendId)

That's pretty much the whole point of the patch. Instead of
identifying relations as simply "temporary" or "not temporary", we
identify as "a temporary relation owned by backend X" or as "not
temporary".

- you added a question like this "if (rel->rd_backend == MyBackendId)"
in a few places... why are you assuming that? that couldn't be a new
created relation (in current session of course)? is that safe?

Again, rd_backend is not the creating backend ID unless the relation
is a temprel.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#6Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Robert Haas (#5)
Re: including backend ID in relpath of temp rels - updated patch

On Sun, Jul 25, 2010 at 4:32 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Sun, Jul 25, 2010 at 2:37 AM, Jaime Casanova <jaime@2ndquadrant.com> wrote:

but i have a few questions, maybe is right what you did i only want to
understand it:
- you added this in include/storage/smgr.h, so why is safe to assume
that if the backend != InvalidBackendId it must be a temp relation?

+#define SmgrIsTemp(smgr) \
+   ((smgr)->smgr_rnode.backend != InvalidBackendId)

That's pretty much the whole point of the patch.  Instead of
identifying relations as simply "temporary" or "not temporary", we
identify as "a temporary relation owned by backend X" or as "not
temporary".

Ok, this one seems good enough... i'm marking it as "ready for committer"

--
Jaime Casanova         www.2ndQuadrant.com
Soporte y capacitación de PostgreSQL

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#2)
Re: including backend ID in relpath of temp rels - updated patch

Robert Haas <robertmhaas@gmail.com> writes:

[ BackendRelFileNode patch ]

One thing that I find rather distressing about this is the 25% bloat
in sizeof(SharedInvalidationMessage). Couldn't we avoid that? Is it
really necessary to *ever* send an SI message for a backend-local rel?

I agree that one needs to send relcache inval sometimes for temp rels,
but I don't see why each backend couldn't interpret that as a flush
on its own local version.

regards, tom lane

#8Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#7)
Re: including backend ID in relpath of temp rels - updated patch

On Thu, Aug 5, 2010 at 7:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

[ BackendRelFileNode patch ]

One thing that I find rather distressing about this is the 25% bloat
in sizeof(SharedInvalidationMessage).  Couldn't we avoid that?  Is it
really necessary to *ever* send an SI message for a backend-local rel?

It can be dropped or unlinked by another backend, so, yes.

I agree that one needs to send relcache inval sometimes for temp rels,
but I don't see why each backend couldn't interpret that as a flush
on its own local version.

The problem is that receipt of the inval message results in a call to
smgrclosenode(), which has to look up the (Backend)RelFileNode in
SMgrRelationHash. If the backend ID isn't present, we can't search
the hash table efficiently. This is not only a problem for
smgrclosenode(), either; that hash is used in a bunch of places, so
changing the hash key isn't really an option. One possibility would
be to have two separate shared-invalidation message types for smgr.
One would indicate that a non-temporary relation was flushed, so the
backend ID field is known to be -1 and we can search the hash quickly.
The other would indicate that a temporary relation was flushed and
you'd have to scan the whole hash table to find and flush all matching
entries. That still doesn't sound great, though. Another thought I
had was that if we could count on the backend ID to fit into an int16
we could fit it in to what's currently padding space. That would
require rather dramatically lowering the maximum number of backends
(currently INT_MAX/4), but it's a little hard to imagine that we can
really support more than 32,767 simultaneous backends anyway.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#8)
Re: including backend ID in relpath of temp rels - updated patch

Robert Haas <robertmhaas@gmail.com> writes:

On Thu, Aug 5, 2010 at 7:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

One thing that I find rather distressing about this is the 25% bloat
in sizeof(SharedInvalidationMessage). �Couldn't we avoid that? �Is it
really necessary to *ever* send an SI message for a backend-local rel?

It can be dropped or unlinked by another backend, so, yes.

Really? Surely that should be illegal during normal operation. We
might be doing such during crash recovery, but we don't need to
broadcast sinval messages then.

It might be sufficient to consider that there are "local" and "global"
smgr inval messages, where the former never get out of the generating
backend, so a bool is enough in the message struct.

had was that if we could count on the backend ID to fit into an int16
we could fit it in to what's currently padding space. That would
require rather dramatically lowering the maximum number of backends
(currently INT_MAX/4), but it's a little hard to imagine that we can
really support more than 32,767 simultaneous backends anyway.

Yeah, that occurred to me too. A further thought is that the id field
could probably be reduced to 1 byte, leaving 3 for backendid, which
would certainly be plenty. However representing that in a portable
struct declaration would be a bit painful I fear.

regards, tom lane

#10Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#9)
Re: including backend ID in relpath of temp rels - updated patch

On Fri, Aug 6, 2010 at 1:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Thu, Aug 5, 2010 at 7:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

One thing that I find rather distressing about this is the 25% bloat
in sizeof(SharedInvalidationMessage).  Couldn't we avoid that?  Is it
really necessary to *ever* send an SI message for a backend-local rel?

It can be dropped or unlinked by another backend, so, yes.

Really?  Surely that should be illegal during normal operation. We
might be doing such during crash recovery, but we don't need to
broadcast sinval messages then.

autovacuum.c does it when we start to worry about XID wraparound, but
you can also do it from any normal backend. Just "DROP TABLE
pg_temp_2.foo" or whatever and away you go.

It might be sufficient to consider that there are "local" and "global"
smgr inval messages, where the former never get out of the generating
backend, so a bool is enough in the message struct.

It would be nice to be able to do it that way, but I don't believe
it's the case, per the above.

had was that if we could count on the backend ID to fit into an int16
we could fit it in to what's currently padding space.  That would
require rather dramatically lowering the maximum number of backends
(currently INT_MAX/4), but it's a little hard to imagine that we can
really support more than 32,767 simultaneous backends anyway.

Yeah, that occurred to me too.  A further thought is that the id field
could probably be reduced to 1 byte, leaving 3 for backendid, which
would certainly be plenty.  However representing that in a portable
struct declaration would be a bit painful I fear.

Well, presumably we'd just represent it as a 1-byte field followed by
a 2-byte field, and do a bit of math. But I don't really see the
point. The whole architecture of a shared invalidation queue is
fundamentally non-scalable because it's a broadcast medium. If we
wanted to efficiently support even thousands of backends (let alone
tens or hundreds of thousands) I assume we would need to rearchitect
this completely with more fine-grained queues, and have backends
subscribe to the queues pertaining to the objects they want to access
before touching them. Or maybe something else entirely. But I don't
think broadcasting to 30,000 backends is going to work for the same
reason that plugging 30,000 machines into an Ethernet *hub* doesn't
work.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#10)
Re: including backend ID in relpath of temp rels - updated patch

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Aug 6, 2010 at 1:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Really? �Surely that should be illegal during normal operation. We
might be doing such during crash recovery, but we don't need to
broadcast sinval messages then.

autovacuum.c does it when we start to worry about XID wraparound, but
you can also do it from any normal backend. Just "DROP TABLE
pg_temp_2.foo" or whatever and away you go.

Mph. I'm still not convinced that the sinval message needs to carry
backendid. But maybe the first-cut solution should just be to squeeze
the id into the padding area. You should be able to get up to 65535
allowed backends, not 32k --- or perhaps use different message type IDs
for local and global backendid, so that all 65536 bitpatterns are
allowed for a non-global backendid.

Well, presumably we'd just represent it as a 1-byte field followed by
a 2-byte field, and do a bit of math. But I don't really see the
point. The whole architecture of a shared invalidation queue is
fundamentally non-scalable because it's a broadcast medium.

Sure, it tops out somewhere, but 32K is way too close to configurations
we know work well enough in the field (I've seen multiple reports of
people using a couple thousand backends). In any case, sinval readers
don't block each other in the current implementation, so I'm really
dubious that there's any inherent scalability limitation there. I'll
hold still for 64K but I think it might be better to go for 2^24.

regards, tom lane

#12Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Robert Haas (#10)
Re: including backend ID in relpath of temp rels - updated patch

On Fri, Aug 6, 2010 at 12:50 PM, Robert Haas <robertmhaas@gmail.com> wrote:

 Just "DROP TABLE
pg_temp_2.foo" or whatever and away you go.

wow! that's true... and certainly a bug...
we shouldn't allow any session to drop other session's temp tables, or
is there a reason for this misbehavior?

--
Jaime Casanova         www.2ndQuadrant.com
Soporte y capacitación de PostgreSQL

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jaime Casanova (#12)
Re: including backend ID in relpath of temp rels - updated patch

Jaime Casanova <jaime@2ndquadrant.com> writes:

On Fri, Aug 6, 2010 at 12:50 PM, Robert Haas <robertmhaas@gmail.com> wrote:

Just "DROP TABLE pg_temp_2.foo" or whatever and away you go.

wow! that's true... and certainly a bug...

No, it's not a bug. You'll find only superusers can do it.

we shouldn't allow any session to drop other session's temp tables, or
is there a reason for this misbehavior?

It is intentional, though I'd be willing to give it up if we had more
bulletproof crash-cleanup of temp tables --- which is one of the things
this patch is supposed to result in.

regards, tom lane

#14Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#11)
Re: including backend ID in relpath of temp rels - updated patch

On Fri, Aug 6, 2010 at 2:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Aug 6, 2010 at 1:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Really?  Surely that should be illegal during normal operation. We
might be doing such during crash recovery, but we don't need to
broadcast sinval messages then.

autovacuum.c does it when we start to worry about XID wraparound, but
you can also do it from any normal backend.  Just "DROP TABLE
pg_temp_2.foo" or whatever and away you go.

Mph.  I'm still not convinced that the sinval message needs to carry
backendid.

Hey, if you have an idea... I'd love to get rid of it, too, but I
don't see how to do it.

But maybe the first-cut solution should just be to squeeze
the id into the padding area.  You should be able to get up to 65535
allowed backends, not 32k --- or perhaps use different message type IDs
for local and global backendid, so that all 65536 bitpatterns are
allowed for a non-global backendid.

Well, presumably we'd just represent it as a 1-byte field followed by
a 2-byte field, and do a bit of math.  But I don't really see the
point.  The whole architecture of a shared invalidation queue is
fundamentally non-scalable because it's a broadcast medium.

Sure, it tops out somewhere, but 32K is way too close to configurations
we know work well enough in the field (I've seen multiple reports of
people using a couple thousand backends).  In any case, sinval readers
don't block each other in the current implementation, so I'm really
dubious that there's any inherent scalability limitation there.  I'll
hold still for 64K but I think it might be better to go for 2^24.

Well, I wouldn't expect anyone to use an exclusive lock for readers
without a good reason, but you still have n backends that each have to
read, presumably, about O(n) messages, so eventually that's going to
start to pinch.

Do you think it's worth worrying about the reduction in the number of
possible SI message types?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#15Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#13)
Re: including backend ID in relpath of temp rels - updated patch

On Fri, Aug 6, 2010 at 2:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Jaime Casanova <jaime@2ndquadrant.com> writes:

On Fri, Aug 6, 2010 at 12:50 PM, Robert Haas <robertmhaas@gmail.com> wrote:

Just "DROP TABLE pg_temp_2.foo" or whatever and away you go.

wow! that's true... and certainly a bug...

No, it's not a bug.  You'll find only superusers can do it.

we shouldn't allow any session to drop other session's temp tables, or
is there a reason for this misbehavior?

It is intentional, though I'd be willing to give it up if we had more
bulletproof crash-cleanup of temp tables --- which is one of the things
this patch is supposed to result in.

This patch only directly addresses the issue of cleaning up the
storage, so there are still the catalog entries to worry about. But
it doesn't seem impossible to think about building on this approach to
eventually handle that part of the problem better, too. I haven't
thought too much about what that would look like, but I think it could
be done.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#14)
Re: including backend ID in relpath of temp rels - updated patch

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Aug 6, 2010 at 2:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Sure, it tops out somewhere, but 32K is way too close to configurations
we know work well enough in the field (I've seen multiple reports of
people using a couple thousand backends).

Well, I wouldn't expect anyone to use an exclusive lock for readers
without a good reason, but you still have n backends that each have to
read, presumably, about O(n) messages, so eventually that's going to
start to pinch.

Sure, but I don't see much to be gained from multiple queues either.
There are few (order of zero, in fact) cases where sinval messages
are transmitted that aren't of potential interest to all backends.
Maybe you could do something useful with a very large number of
dynamically-defined queues (like one per relation) ... but managing that
would probably swamp any savings.

Do you think it's worth worrying about the reduction in the number of
possible SI message types?

IIRC the number of message types is the number of catalog caches plus
half a dozen or so. We're a long way from exhausting even a 1-byte
ID field; and we could play more games if we had to, since there would
be a padding byte free in the message types that refer to a catalog
cache. IOW, 1-byte id doesn't bother me.

regards, tom lane

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#15)
Re: including backend ID in relpath of temp rels - updated patch

Robert Haas <robertmhaas@gmail.com> writes:

This patch only directly addresses the issue of cleaning up the
storage, so there are still the catalog entries to worry about. But
it doesn't seem impossible to think about building on this approach to
eventually handle that part of the problem better, too. I haven't
thought too much about what that would look like, but I think it could
be done.

Perhaps run through pg_class after restart and flush anything marked
relistemp? Although the ideal solution, probably, would be for temp
tables to not have persistent catalog entries in the first place.

regards, tom lane

#18David Fetter
david@fetter.org
In reply to: Tom Lane (#17)
Re: including backend ID in relpath of temp rels - updated patch

On Fri, Aug 06, 2010 at 03:00:35PM -0400, Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

This patch only directly addresses the issue of cleaning up the
storage, so there are still the catalog entries to worry about.
But it doesn't seem impossible to think about building on this
approach to eventually handle that part of the problem better,
too. I haven't thought too much about what that would look like,
but I think it could be done.

Perhaps run through pg_class after restart and flush anything marked
relistemp? Although the ideal solution, probably, would be for temp
tables to not have persistent catalog entries in the first place.

For the upcoming global temp tables, which are visible in other
sessions, how would this work?

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Fetter (#18)
Re: including backend ID in relpath of temp rels - updated patch

David Fetter <david@fetter.org> writes:

On Fri, Aug 06, 2010 at 03:00:35PM -0400, Tom Lane wrote:

Perhaps run through pg_class after restart and flush anything marked
relistemp? Although the ideal solution, probably, would be for temp
tables to not have persistent catalog entries in the first place.

For the upcoming global temp tables, which are visible in other
sessions, how would this work?

Well, that's a totally different matter. Those would certainly have
persistent entries, at least for the "master" version. I don't think
anyone's really figured out what the best implementation looks like;
but maybe there would be per-backend "child" versions that could act
just like the current kind of temp table (and again would ideally not
have any persistent catalog entries).

regards, tom lane

#20Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#17)
Re: including backend ID in relpath of temp rels - updated patch

On Fri, Aug 6, 2010 at 3:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

This patch only directly addresses the issue of cleaning up the
storage, so there are still the catalog entries to worry about.  But
it doesn't seem impossible to think about building on this approach to
eventually handle that part of the problem better, too.  I haven't
thought too much about what that would look like, but I think it could
be done.

Perhaps run through pg_class after restart and flush anything marked
relistemp?

The trouble is that you have to bind to a database before you can run
through pg_class, and the postmaster doesn't. Of course, if it could
attach to a database and then detach again, this might be feasible,
although perhaps still a bit overly complex for the postmaster, but in
any event it doesn't. We seem to already have a mechanism in place
where a backend that creates a temprel nukes any pre-existing temprel
schema for its own backend, but of course that's not fool-proof.

Although the ideal solution, probably, would be for temp
tables to not have persistent catalog entries in the first place.

I've been thinking about that, but it's a bit challenging to imagine
how it could work. It's not just the pg_class entries you have to
think about, but also pg_attrdef, pg_attribute, pg_constraint,
pg_description, pg_index, pg_rewrite, and pg_trigger. An even
stickier problem is that we have lots of places in the backend code
that refer to objects by OID. We'd either need to change all of that
code (which seems like a non-starter) or somehow guarantee that the
OIDs assigned to any given backend's private objects would be
different from those assigned to any public object (which I also don't
see how to do).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#21Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#16)
#22Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Robert Haas (#20)
#23Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#21)
#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#23)
#25David Fetter
david@fetter.org
In reply to: Tom Lane (#24)
#26Joshua D. Drake
jd@commandprompt.com
In reply to: David Fetter (#25)
#27David Fetter
david@fetter.org
In reply to: Joshua D. Drake (#26)
#28Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: David Fetter (#27)
#29Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#24)
#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#29)
#31Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#30)
#32Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Robert Haas (#29)
#33Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#32)
#34Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#33)
#35Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#31)
#36Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#34)
#37Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#35)
#38Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#37)
#39Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#31)
#40Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#39)
#41Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#40)
#42Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#41)
#43Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#29)
#44Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#43)
#45Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#44)
#46Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#45)
#47Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#46)