Vacuuming leaked temp tables (once again)
This thread
http://archives.postgresql.org/pgsql-hackers/2008-01/msg00134.php
kind of wandered off into the weeds after identifying a semi-related
bug in CLUSTER, but the original problem still remains: if a backend
crashes after creating some temp tables, the tables remain present.
Such tables will get recycled next time someone reuses the same
pg_temp_NNN schema. But if the failed backend had been occupying an
unusually high-numbered BackendId slot, then its pg_temp_NNN schema
might go unused for a long time --- long enough for the temp tables to
pose an xid-wraparound problem. There's another report of this issue
today in pgsql-general.
The only solution proposed in that thread was to auto-delete temp
tables at postmaster restart; which I opposed on the grounds that
throwing away data right after a crash was a terrible idea from a
forensic standpoint. I still think that, but I had another idea
about how to cope with the situation. It's reasonably easy to
tell (by looking into the sinval state) whether a given BackendId
slot is actually in use, so we could detect whether a temp table
actually belongs to a live backend or not. What I'm thinking is
we should adjust autovacuum so that it will apply anti-wraparound
vacuuming operations even to temp tables, if they belong to pg_temp
schemas that belong to inactive BackendId slots. This'd fix the
wraparound issue without any risk of discarding data that someone
might want back.
Note that this should be safe even if someone claims the pg_temp_NNN
schema and tries to drop the old temp table while we're vacuuming it.
Operations on temp tables take the normal types of locks, so that
will get interlocked properly.
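The detection step described above — treating a table as orphaned only if it lives in a pg_temp_NNN schema whose BackendId slot is not currently occupied — can be sketched roughly as follows. This is an illustrative Python sketch, not PostgreSQL's actual C code; the function names and the representation of the sinval state as a set of active BackendIds are assumptions for clarity.

```python
import re

def temp_schema_backend_id(schema_name):
    """Extract NNN from a pg_temp_NNN schema name, or None for an
    ordinary (non-temp) schema."""
    m = re.fullmatch(r"pg_temp_(\d+)", schema_name)
    return int(m.group(1)) if m else None

def is_orphaned_temp_table(schema_name, active_backend_ids):
    """A temp table counts as orphaned iff its pg_temp_NNN schema maps
    to a BackendId slot that no live backend occupies (in the real
    system this check would consult the sinval state)."""
    backend_id = temp_schema_backend_id(schema_name)
    if backend_id is None:
        return False  # not a temp schema at all
    return backend_id not in active_backend_ids
```

With this test in hand, autovacuum can safely apply anti-wraparound vacuuming to tables for which it returns true, relying on ordinary locking to interlock against a new backend claiming the schema.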
A small hole in this idea is that the BackendId slot might be occupied
by some new backend that actually hasn't created any temp tables yet
(hence not "taken possession" of the pg_temp_NNN schema). We could fix
that by making each backend's has-temp-tables state globally visible.
However, I'm inclined to think it's not really an issue, because you
wouldn't get into trouble unless this was always the case over many
repeated autovacuum visits to the table, which seems pretty improbable.
Another issue is that leftover temp tables would be significantly more
likely to be self-inconsistent than normal tables, since operations on
them are not WAL-logged and it's entirely likely that the owning backend
crashed with some dirty pages not written out from its local buffers.
AFAICS this shouldn't be any big problem for vacuuming the table proper,
since heap pages are pretty independent, at least at the level
understood by plain vacuum. There is a risk that indexes would be
corrupt enough to make vacuum error out, thus preventing the xid
wraparound cleanup from completing. But that leaves us no worse off
than we are now, and at least there would be signs of distress in the
postmaster log for the DBA to see.
Or we could have autovacuum just drop orphaned temp tables, *if*
they have gotten old enough to need anti-wraparound vacuuming.
While I'm still uncomfortable with having autovac drop anything,
at least this would avoid the worst cases of "gee I really needed
that data to investigate the crash". The main attractions of this
idea are avoiding the corrupt-index issue and not doing vacuuming
work that's 99.99% sure to be useless.
Thoughts?
regards, tom lane
"Tom Lane" <tgl@sss.pgh.pa.us> writes:
The main attractions of this idea are avoiding the corrupt-index issue and
not doing vacuuming work that's 99.99% sure to be useless.
It does seem strange to me to vacuum a table you're pretty sure is useless
*and* quite likely corrupt.
Could autovacuum emit log messages as soon as it sees such tables and start
dropping them at some point later?
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's RemoteDBA services!
Tom Lane wrote:
Another issue is that leftover temp tables would be significantly more
likely to be self-inconsistent than normal tables, since operations on
them are not WAL-logged and it's entirely likely that the owning backend
crashed with some dirty pages not written out from its local buffers.
AFAICS this shouldn't be any big problem for vacuuming the table proper,
since heap pages are pretty independent, at least at the level
understood by plain vacuum.
There's the torn-page problem as well. Highly improbable, but it seems
possible to me to have an inconsistent heap page with, for example,
broken redirecting line pointers or something like that, which would cause
crashes or assertion failures on vacuum.
Or we could have autovacuum just drop orphaned temp tables, *if*
they have gotten old enough to need anti-wraparound vacuuming.
While I'm still uncomfortable with having autovac drop anything,
at least this would avoid the worst cases of "gee I really needed
that data to investigate the crash". The main attractions of this
idea are avoiding the corrupt-index issue and not doing vacuuming
work that's 99.99% sure to be useless.
That sounds a lot simpler and better to me.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Gregory Stark <stark@enterprisedb.com> writes:
Could autovacuum emit log messages as soon as it sees such tables and start
dropping them at some point later?
We might have to rearrange the logic a bit to make that happen (I'm not
sure what order things get tested in), but a log message does seem like
a good idea. I'd go for logging anytime an orphaned table is seen,
and dropping once it's past the anti-wraparound horizon.
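The policy converged on here — log every sighting of an orphaned temp table, and drop it only once it crosses the anti-wraparound horizon — amounts to a small decision function. A minimal sketch, assuming an age-based test against autovacuum_freeze_max_age (the actual autovacuum.c logic is more involved):

```python
def autovac_action_for_orphan(relfrozenxid_age, freeze_max_age):
    """Log-first, drop-later policy for orphaned temp tables:
    keep the data around for forensics until its relfrozenxid age
    passes the anti-wraparound horizon, then reclaim it rather than
    attempt a vacuum that is almost certainly useless (and risky if
    the indexes are corrupt)."""
    if relfrozenxid_age > freeze_max_age:
        return "drop"  # past the horizon: drop instead of vacuuming
    return "log"       # just warn the DBA; data kept for inspection
```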
regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
The only solution proposed in that thread was to auto-delete temp
tables at postmaster restart; which I opposed on the grounds that
throwing away data right after a crash was a terrible idea from a
forensic standpoint.
Why not just rename the files out of the way, and nuke the entries from
the catalog? Something like "filename.crash" or similar that an admin
can have scripts in place to check for and who could then go handle as
appropriate (remove, investigate, etc). If there's data in the catalog
that you think might be bad to lose, then include it in some kind of
format in a "filename.crash.catalog" or similar file. Maybe also spit
out a warning or error or something on backend start when this rename is
done, and on subsequent starts if the file remains.
What I really don't like is keeping around something that is likely
useless and possibly deadly to a backend if it's corrupt and gets
accessed. I'm
also not a big fan of keeping what is essentially 'garbage' around in
the catalog which could just lead to confusion later if someone's
looking at "what actual temporary tables should there be" and seeing
others that they wouldn't expect.
Thanks,
Stephen
Stephen Frost wrote:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
The only solution proposed in that thread was to auto-delete temp
tables at postmaster restart; which I opposed on the grounds that
throwing away data right after a crash was a terrible idea from a
forensic standpoint.
Why not just rename the files out of the way, and nuke the entries from
the catalog? Something like "filename.crash" or similar that an admin
can have scripts in place to check for and who could then go handle as
appropriate (remove, investigate, etc).
This was my thought too.
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Stephen Frost <sfrost@snowman.net> writes:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
The only solution proposed in that thread was to auto-delete temp
tables at postmaster restart; which I opposed on the grounds that
throwing away data right after a crash was a terrible idea from a
forensic standpoint.
Why not just rename the files out of the way, and nuke the entries from
the catalog?
It's usually tough to make any sense of the contents of a table if you
don't have the catalog entries. Anyway, that approach would put the
onus on the admin to clean things up eventually, which isn't all that
appealing.
Bear in mind that temp table contents are subject to summary deletion
during normal operation anyway. What I opposed back in January was
deleting them *immediately* after a crash, but that doesn't mean I'm
in favor of keeping them indefinitely.
regards, tom lane
Tom Lane wrote:
Gregory Stark <stark@enterprisedb.com> writes:
Could autovacuum emit log messages as soon as it sees such tables and start
dropping them at some point later?
We might have to rearrange the logic a bit to make that happen (I'm not
sure what order things get tested in), but a log message does seem like
a good idea. I'd go for logging anytime an orphaned table is seen,
and dropping once it's past the anti-wraparound horizon.
I don't think this requires much of a rearrangement -- see autovacuum.c
1921ff.
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Heikki Linnakangas wrote:
Or we could have autovacuum just drop orphaned temp tables, *if*
they have gotten old enough to need anti-wraparound vacuuming.
While I'm still uncomfortable with having autovac drop anything,
at least this would avoid the worst cases of "gee I really needed
that data to investigate the crash". The main attractions of this
idea are avoiding the corrupt-index issue and not doing vacuuming
work that's 99.99% sure to be useless.
That sounds a lot simpler and better to me.
Yeah, when I read the original this one struck me as almost a no-brainer
choice.
cheers
andrew
Alvaro Herrera <alvherre@commandprompt.com> writes:
Tom Lane wrote:
We might have to rearrange the logic a bit to make that happen (I'm not
sure what order things get tested in), but a log message does seem like
a good idea. I'd go for logging anytime an orphaned table is seen,
and dropping once it's past the anti-wraparound horizon.
I don't think this requires much of a rearrangement -- see autovacuum.c
1921ff.
So everyone is happy with the concept of doing it as above? If so,
I'll work on it this weekend sometime.
regards, tom lane
Tom Lane wrote:
We might have to rearrange the logic a bit to make that happen (I'm not
sure what order things get tested in), but a log message does seem like
a good idea. I'd go for logging anytime an orphaned table is seen,
and dropping once it's past the anti-wraparound horizon.
Is there an easy way for an admin to clean up the lost temp tables that
autovacuum is complaining about? It seems like it could be a long time
and a lot of log messages between when they are first orphaned and
finally dropped due to anti-wraparound protection.
Tom Lane writes:
Alvaro Herrera <alvherre@commandprompt.com> writes:
Tom Lane wrote:
We might have to rearrange the logic a bit to make that happen (I'm not
sure what order things get tested in), but a log message does seem like
a good idea. I'd go for logging anytime an orphaned table is seen,
and dropping once it's past the anti-wraparound horizon.
I don't think this requires much of a rearrangement -- see autovacuum.c
1921ff.
So everyone is happy with the concept of doing it as above? If so,
I'll work on it this weekend sometime.
I think it is the most reasonable thing to do. Regarding the log
messages about orphaned tables, it would be nice if you could add a
hint/detail message explaining how to cleanup those tables. If that's
possible.
Best Regards
Michael Paesold
"Matthew T. O'Connor" <matthew@zeut.net> writes:
Is there an easy way for an admin to clean up the lost temp tables that
autovacuum is complaining about? It seems like it could be a long time
and a lot of log messages between when they are first orphaned and
finally dropped due to anti-wraparound protection.
Drop the particular temp schema, maybe? The log message should probably
make sure to specify the schema name.
regards, tom lane
Alvaro Herrera <alvherre@commandprompt.com> writes:
Tom Lane wrote:
We might have to rearrange the logic a bit to make that happen (I'm not
sure what order things get tested in), but a log message does seem like
a good idea. I'd go for logging anytime an orphaned table is seen,
and dropping once it's past the anti-wraparound horizon.
I don't think this requires much of a rearrangement -- see autovacuum.c
1921ff.
Hmm, maybe I'm missing something but I see no good way to do it without
refactoring relation_check_autovac. Since that function is only called
in one place, I'm thinking of just inlining it; do you see a reason
not to?
regards, tom lane
Tom Lane wrote:
Alvaro Herrera <alvherre@commandprompt.com> writes:
Tom Lane wrote:
We might have to rearrange the logic a bit to make that happen (I'm not
sure what order things get tested in), but a log message does seem like
a good idea. I'd go for logging anytime an orphaned table is seen,
and dropping once it's past the anti-wraparound horizon.
I don't think this requires much of a rearrangement -- see autovacuum.c
1921ff.
Hmm, maybe I'm missing something but I see no good way to do it without
refactoring relation_check_autovac.
Hmm, oops :-)
Since that function is only called in one place, I'm thinking of just
inlining it; do you see a reason not to?
Nope, go ahead.
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
On Fri, 2008-06-27 at 12:43 -0400, Tom Lane wrote:
Stephen Frost <sfrost@snowman.net> writes:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
The only solution proposed in that thread was to auto-delete temp
tables at postmaster restart; which I opposed on the grounds that
throwing away data right after a crash was a terrible idea from a
forensic standpoint.
Why not just rename the files out of the way, and nuke the entries from
the catalog?
It's usually tough to make any sense of the contents of a table if you
don't have the catalog entries. Anyway, that approach would put the
onus on the admin to clean things up eventually, which isn't all that
appealing.
We need to learn to live without the catalog entries anyway.
In Hot Standby mode, creating a temp table would cause a write to the
catalog, which would then fail. If we want to allow creating temp tables
in Hot Standby mode, then we must avoid inserting them into, and later
deleting them from, the catalog.
So it would seem that we need a way of handling temp tables that doesn't
rely on catalog entries at all.
My proposal would be that we identify all temp table names by the xid
that first creates them (and their name...). In memory we would keep
track of which files go with which tablenames, so we can still locate
them easily. If we lose that context we can always see which files are
in need of vacuum using that xid.
Are we also at risk from very long lived sessions that use temp tables?
Do we need to put in a check for sessions to vacuum old temp tables that
are still valid?
I'm happy to work on this if a solution to all of the above emerges.
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Simon Riggs <simon@2ndquadrant.com> writes:
So it would seem that we need a way of handling temp tables that doesn't
rely on catalog entries at all.
That's a complete non-starter; I need go no farther than to point out
that it would break clients that expect to see their temp tables
reflected in pg_class and so forth.
There's been some idle musing in the past about causing pg_class and
friends to have inheritance-child tables that are "temp", both in the
sense of being backend-local and of not having guaranteed storage,
and arranging to store the rows needed for a backend's temp tables
in there. There's still a long way to go to make that happen, but
at least it would be reasonably transparent on the client side.
Are we also at risk from very long lived sessions that use temp tables?
Yeah, one of the bigger problems with it is that no one else could
even see whether a backend's temp table was at risk of wraparound
(or even before actual wraparound, in need of freezing because pg_clog
is supposed to be truncated). Possibly a backend could advertise a
min frozen xid for its temp tables in the PGPROC array.
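The wraparound hazard underlying all of this comes from PostgreSQL's circular 32-bit xid space: two xids are only comparable while they are within 2^31 of each other, which is why a table whose relfrozenxid falls too far behind must be frozen. A Python sketch of the modulo-2^31 comparison (the same idea as TransactionIdPrecedes in the C code, ignoring the special-cased permanent xids):

```python
def xid_precedes(xid1, xid2):
    """Circular comparison of 32-bit transaction IDs: xid1 logically
    precedes xid2 iff the signed 32-bit difference (xid1 - xid2) is
    negative, i.e. the low 32 bits of the difference have the sign
    bit set."""
    diff = (xid1 - xid2) & 0xFFFFFFFF
    return diff >= 0x80000000
```

Note that the comparison works across the wrap point: an old xid near the top of the range still precedes a newer xid that has wrapped around to small values, but only so long as they remain less than 2^31 apart.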
regards, tom lane
On Fri, 2008-07-11 at 17:26 -0400, Tom Lane wrote:
Simon Riggs <simon@2ndquadrant.com> writes:
So it would seem that we need a way of handling temp tables that doesn't
rely on catalog entries at all.
That's a complete non-starter; I need go no farther than to point out
that it would break clients that expect to see their temp tables
reflected in pg_class and so forth.
What does the SQL Standard say about the Information Schema, I wonder?
There's been some idle musing in the past about causing pg_class and
friends to have inheritance-child tables that are "temp", both in the
sense of being backend-local and of not having guaranteed storage,
and arranging to store the rows needed for a backend's temp tables
in there. There's still a long way to go to make that happen, but
at least it would be reasonably transparent on the client side.
OK, very cool if we could make it work. I realised it would have to
apply all the way through to pg_statistic.
Brain dump a little more, while we're on the subject? This aspect is
something I've not spent any time on yet, so even a few rungs up the
ladder will help lots. Thanks.
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
On Sat, Jul 12, 2008 at 12:41 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
On Fri, 2008-07-11 at 17:26 -0400, Tom Lane wrote:
Simon Riggs <simon@2ndquadrant.com> writes:
So it would seem that we need a way of handling temp tables that doesn't
rely on catalog entries at all.
That's a complete non-starter; I need go no farther than to point out
that it would break clients that expect to see their temp tables
reflected in pg_class and so forth.
What does the SQL Standard say about the Information Schema, I wonder?
Many apps were written long before we had one. Not to mention that it
doesn't provide anything like all the info that a PostgreSQL-specific
tool (though not necessarily user apps) would likely need.
--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com
On Sat, 2008-07-12 at 00:57 +0100, Dave Page wrote:
On Sat, Jul 12, 2008 at 12:41 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
On Fri, 2008-07-11 at 17:26 -0400, Tom Lane wrote:
Simon Riggs <simon@2ndquadrant.com> writes:
So it would seem that we need a way of handling temp tables that doesn't
rely on catalog entries at all.
That's a complete non-starter; I need go no farther than to point out
that it would break clients that expect to see their temp tables
reflected in pg_class and so forth.
What does the SQL Standard say about the Information Schema, I wonder?
Many apps were written long before we had one. Not to mention that it
doesn't provide anything like all the info that a PostgreSQL-specific
tool (though not necessarily user apps) would likely need.
So are you saying
a) that other sessions need to be able to see pg_class entries for
temporary tables created by different sessions?
b) that you need to be able to see pg_class entries for temporary tables
created only in your session?
Hopefully you just mean (b)??
a) would simply not be possible at the same time as having us avoid
pg_class writes in Hot Standby mode. We would have a choice of seeing
temp tables, or allowing temp tables to work in Hot Standby, not both.
b) would be possible, if we follow the route of taking a locally
inherited copy of pg_class.
The SQL Standard Information Schema does show LOCAL TEMPORARY and GLOBAL
TEMPORARY tables. Our implementation of temp tables differs from the
standard, so I think (b) is fully in line with that.
If anybody did want (a), then I'd suggest that we use the keyword GLOBAL
TEMPORARY table to denote something that would be put in pg_class and
visible to all, and thus unusable in hot standby mode. I'm not planning
on building that.
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support