Temporary tables versus wraparound... again

Started by Bruce Momjianover 5 years ago55 messageshackers
Jump to latest
#1Bruce Momjian
bruce@momjian.us

We had an outage caused by transaction wraparound. And yes, one of the
first things I did on this site was check that we didn't have any
databases that were in danger of wraparound.

However since then we added a monitoring job that used a temporary
table with ON COMMIT DELETE ROWS. Since it was a simple monitoring job
it stayed connected to the database and used this small temporary
table for a very long period of time.

The temporary table never got vacuumed by autovacuum and never by the
monitoring job (since it was being truncated on every transaction why
would it need to be vacuumed...).

We've been around this bush before. Tom added orphaned table
protection to autovacuum precisely because temporary tables can cause
the datfrozenxid to get held back indefinitely. Then Michael Paquier
and Tsunakawa Takayuki both found it worth making this more
aggressive.

But none of that helped as the temporary schema was still in use so
they were not considered "orphaned" temp tables at all.

I think we need to add some warnings to autovacuum when it detects
*non* orphaned temporary tables that are older than the freeze
threshold.

However in the case of ON COMMIT DELETE ROWS we can do better. Why not
just reset the relfrozenxid and other stats as if the table was
freshly created when it's truncated?

I put together this quick patch to check the idea and it seems to
integrate fine in the code. I'm not sure about a few points but I
don't think they're showstoppers.

1) Should we update relpages and reltuples. I think so but an argument
could be made that people might be depending on running analyze once
when the data is loaded and then not having to run analyze on every
data load.

2) adding the dependency on heapam.h to heap.c makes sense because of
heap_inplace_update bt it may be a bit annoying because I suspect
that's a useful sanity check that the tableam stuff hasn't been
bypassed

3) I added a check to the regression tests but I'm not sure it's a
good idea to actually commit this. It could fail if there's a parallel
transaction going on and even moving the test to the serial schedule
might not guarantee that never happens due to autovacuum running
analyze?

I didn't actually add the warning to autovacuum yet.

--
greg

Attachments:

0001-update-relfrozenxmin-when-truncating-temp-tables.patchtext/x-patch; charset=US-ASCII; name=0001-update-relfrozenxmin-when-truncating-temp-tables.patchDownload+91-4
#2Noah Misch
noah@leadboat.com
In reply to: Bruce Momjian (#1)
Re: Temporary tables versus wraparound... again

On Sun, Nov 08, 2020 at 06:19:57PM -0500, Greg Stark wrote:

However in the case of ON COMMIT DELETE ROWS we can do better. Why not
just reset the relfrozenxid and other stats as if the table was
freshly created when it's truncated?

That concept is sound.

1) Should we update relpages and reltuples. I think so but an argument
could be made that people might be depending on running analyze once
when the data is loaded and then not having to run analyze on every
data load.

I'd wager no, we should not. An app that ever analyzes an ON COMMIT DELETE
ROWS temp table probably analyzes it every time. If not, it's fair to guess
that similar statistics recur in each xact.

2) adding the dependency on heapam.h to heap.c makes sense because of
heap_inplace_update bt it may be a bit annoying because I suspect
that's a useful sanity check that the tableam stuff hasn't been
bypassed

That is not terrible. How plausible would it be to call vac_update_relstats()
for this, instead of reimplementing part of it?

3) I added a check to the regression tests but I'm not sure it's a
good idea to actually commit this. It could fail if there's a parallel
transaction going on and even moving the test to the serial schedule
might not guarantee that never happens due to autovacuum running
analyze?

Right.

@@ -3340,6 +3383,7 @@ heap_truncate_one_rel(Relation rel)

/* Truncate the underlying relation */
table_relation_nontransactional_truncate(rel);
+ ResetVacStats(rel);

I didn't test, but I expect this will cause a stats reset for the second
TRUNCATE here:

CREATE TABLE t ();
...
BEGIN;
TRUNCATE t;
TRUNCATE t; -- inplace relfrozenxid reset
ROLLBACK; -- inplace reset survives

Does that indeed happen?

#3Bruce Momjian
bruce@momjian.us
In reply to: Noah Misch (#2)
Re: Temporary tables versus wraparound... again

On Mon, 9 Nov 2020 at 00:17, Noah Misch <noah@leadboat.com> wrote:

2) adding the dependency on heapam.h to heap.c makes sense because of
heap_inplace_update bt it may be a bit annoying because I suspect
that's a useful sanity check that the tableam stuff hasn't been
bypassed

That is not terrible. How plausible would it be to call vac_update_relstats()
for this, instead of reimplementing part of it?

It didn't seem worth it to change its API to add boolean flags to skip
setting some of the variables (I was originally only doing
relfrozenxid and minmmxid). Now that I'm doing most of the variables
maybe it makes a bit more sense.

@@ -3340,6 +3383,7 @@ heap_truncate_one_rel(Relation rel)

/* Truncate the underlying relation */
table_relation_nontransactional_truncate(rel);
+ ResetVacStats(rel);

I didn't test, but I expect this will cause a stats reset for the second
TRUNCATE here:

CREATE TABLE t ();
...
BEGIN;
TRUNCATE t;
TRUNCATE t; -- inplace relfrozenxid reset
ROLLBACK; -- inplace reset survives

Does that indeed happen?

Apparently no, see below. I have to say I was pretty puzzled by the
actual behaviour which is that the rollback actually does roll back
the inplace update. But I *think* what is happening is that the first
truncate does an MVCC update so the inplace update happens only to the
newly created tuple which is never commited.

Thinking about things a bit this does worry me a bit. I wonder if
inplace update is really safe outside of vacuum where we know we're
not in a transaction that can be rolled back. But IIRC doing a
non-inplace update on pg_class for these columns breaks other things.
I don't know if that's still true.

Also, in checking this question I realized I had missed 3d351d91. I
should be initializing reltuples to -1 not 0.

postgres=# vacuum t;
VACUUM
postgres=# select
relname,relpages,reltuples,relallvisible,relfrozenxid from pg_class
where oid='t'::regclass;
relname | relpages | reltuples | relallvisible | relfrozenxid
---------+----------+-----------+---------------+--------------
t | 9 | 2000 | 9 | 15557
(1 row)

postgres=# begin;
BEGIN
postgres=*# truncate t;
TRUNCATE TABLE
postgres=*# truncate t;
TRUNCATE TABLE
postgres=*# select
relname,relpages,reltuples,relallvisible,relfrozenxid from pg_class
where oid='t'::regclass;
relname | relpages | reltuples | relallvisible | relfrozenxid
---------+----------+-----------+---------------+--------------
t | 0 | 0 | 0 | 15562
(1 row)

postgres=*# abort;
ROLLBACK
postgres=# select
relname,relpages,reltuples,relallvisible,relfrozenxid from pg_class
where oid='t'::regclass;
relname | relpages | reltuples | relallvisible | relfrozenxid
---------+----------+-----------+---------------+--------------
t | 9 | 2000 | 9 | 15557
(1 row)

--
greg

#4Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bruce Momjian (#3)
Re: Temporary tables versus wraparound... again

On Mon, Nov 9, 2020 at 3:23 PM Greg Stark <stark@mit.edu> wrote:

On Mon, 9 Nov 2020 at 00:17, Noah Misch <noah@leadboat.com> wrote:

2) adding the dependency on heapam.h to heap.c makes sense because of
heap_inplace_update bt it may be a bit annoying because I suspect
that's a useful sanity check that the tableam stuff hasn't been
bypassed

That is not terrible. How plausible would it be to call vac_update_relstats()
for this, instead of reimplementing part of it?

It didn't seem worth it to change its API to add boolean flags to skip
setting some of the variables (I was originally only doing
relfrozenxid and minmmxid). Now that I'm doing most of the variables
maybe it makes a bit more sense.

@@ -3340,6 +3383,7 @@ heap_truncate_one_rel(Relation rel)

/* Truncate the underlying relation */
table_relation_nontransactional_truncate(rel);
+ ResetVacStats(rel);

I didn't test, but I expect this will cause a stats reset for the second
TRUNCATE here:

CREATE TABLE t ();
...
BEGIN;
TRUNCATE t;
TRUNCATE t; -- inplace relfrozenxid reset
ROLLBACK; -- inplace reset survives

Does that indeed happen?

Apparently no, see below. I have to say I was pretty puzzled by the
actual behaviour which is that the rollback actually does roll back
the inplace update. But I *think* what is happening is that the first
truncate does an MVCC update so the inplace update happens only to the
newly created tuple which is never commited.

I think in-plase update that the patch introduces is not used because
TRUNCATE doesn't use heap_truncate_one_rel() to truncate a table in
that scenario. It does MVCC update the pg_class tuple for a new
relfilenode with new relfrozenxid and other stats, see
RelationSetNewRelfilenode(). If we create and truncate a table within
the transaction it does in-place update that the patch introduces but
I think it's no problem in this case either.

Thinking about things a bit this does worry me a bit. I wonder if
inplace update is really safe outside of vacuum where we know we're
not in a transaction that can be rolled back. But IIRC doing a
non-inplace update on pg_class for these columns breaks other things.
I don't know if that's still true.

heap_truncate_one_rel() is not a transaction-safe operation. Doing
in-place updates during that operation seems okay to me unless I'm
missing something.

Regards,

--
Masahiko Sawada
EnterpriseDB: https://www.enterprisedb.com/

#5Noah Misch
noah@leadboat.com
In reply to: Masahiko Sawada (#4)
Re: Temporary tables versus wraparound... again

On Tue, Nov 10, 2020 at 04:10:57PM +0900, Masahiko Sawada wrote:

On Mon, Nov 9, 2020 at 3:23 PM Greg Stark <stark@mit.edu> wrote:

On Mon, 9 Nov 2020 at 00:17, Noah Misch <noah@leadboat.com> wrote:

@@ -3340,6 +3383,7 @@ heap_truncate_one_rel(Relation rel)

/* Truncate the underlying relation */
table_relation_nontransactional_truncate(rel);
+ ResetVacStats(rel);

I didn't test, but I expect this will cause a stats reset for the second
TRUNCATE here:

CREATE TABLE t ();
...
BEGIN;
TRUNCATE t;
TRUNCATE t; -- inplace relfrozenxid reset
ROLLBACK; -- inplace reset survives

Does that indeed happen?

Apparently no, see below. I have to say I was pretty puzzled by the
actual behaviour which is that the rollback actually does roll back
the inplace update. But I *think* what is happening is that the first
truncate does an MVCC update so the inplace update happens only to the
newly created tuple which is never commited.

I think in-plase update that the patch introduces is not used because
TRUNCATE doesn't use heap_truncate_one_rel() to truncate a table in
that scenario. It does MVCC update the pg_class tuple for a new
relfilenode with new relfrozenxid and other stats, see
RelationSetNewRelfilenode(). If we create and truncate a table within
the transaction it does in-place update that the patch introduces but
I think it's no problem in this case either.

Agreed. Rolling back a heap_truncate_one_rel() always implies rolling back to
an earlier version of the entire pg_class tuple. (That may not be true of
mapped relations, but truncating them is unreasonable.) Thanks for checking.

Thinking about things a bit this does worry me a bit. I wonder if
inplace update is really safe outside of vacuum where we know we're
not in a transaction that can be rolled back. But IIRC doing a
non-inplace update on pg_class for these columns breaks other things.
I don't know if that's still true.

heap_truncate_one_rel() is not a transaction-safe operation. Doing
in-place updates during that operation seems okay to me unless I'm
missing something.

Yep.

#6Bruce Momjian
bruce@momjian.us
In reply to: Noah Misch (#5)
Re: Temporary tables versus wraparound... again

Here's an updated patch. I added some warning messages to autovacuum.

One thing I learned trying to debug this situation in production is
that it's nigh impossible to find the pid of the session using a
temporary schema. The number in the schema refers to the backendId in
the sinval stuff for which there's no external way to look up the
corresponding pid. It would have been very helpful if autovacuum had
just told me which backend pid to kill.

I still have the regression test in the patch and as before I think
it's probably not worth committing. I'm leaning to reverting that
section of the patch before comitting.

Incidentally there's still a hole here where a new session can attach
to an existing temporary schema where a table was never truncated due
to a session dieing abnormally. That new session could be a long-lived
session but never use the temporary schema causing the old table to
just sit there. Autovacuum has no way to tell it's not actually in
use. I tend to think the optimization to defer cleaning the temporary
schema until it's used might not really be an optimization. It still
needs to be cleaned someday so it's just moving the work around. Just
removing that optimization might be the easiest way to close this
hole. The only alternative I see is adding a flag to PROC or somewhere
where autovacuum can see if the backend has actually initialized the
temporary schema yet or not.

Attachments:

v2-0001-Update-relfrozenxmin-when-truncating-temp-tables.patchtext/x-patch; charset=US-ASCII; name=v2-0001-Update-relfrozenxmin-when-truncating-temp-tables.patchDownload+151-23
#7Nathan Bossart
nathandbossart@gmail.com
In reply to: Bruce Momjian (#6)
Re: Temporary tables versus wraparound... again

On 10/12/21, 3:07 PM, "Greg Stark" <stark@mit.edu> wrote:

Here's an updated patch. I added some warning messages to autovacuum.

I think this is useful optimization, and I intend to review the patch
more closely soon. It looks reasonable to me after a quick glance.

One thing I learned trying to debug this situation in production is
that it's nigh impossible to find the pid of the session using a
temporary schema. The number in the schema refers to the backendId in
the sinval stuff for which there's no external way to look up the
corresponding pid. It would have been very helpful if autovacuum had
just told me which backend pid to kill.

I certainly think it would be good to have autovacuum log the PID, but
IIUC a query like this would help you map the backend ID to the PID:

SELECT bid, pg_stat_get_backend_pid(bid) AS pid FROM pg_stat_get_backend_idset() bid;

Nathan

#8Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#6)
Re: Temporary tables versus wraparound... again

Hi,

On 2021-10-12 18:04:35 -0400, Greg Stark wrote:

Here's an updated patch.

Unfortunately it doesn't apply anymore these days: http://cfbot.cputube.org/patch_37_3358.log

Marked as waiting-on-author.

Greetings,

Andres Freund

#9Bruce Momjian
bruce@momjian.us
In reply to: Andres Freund (#8)
Re: Temporary tables versus wraparound... again

No problem, I can update the patch and check on the fuzz.

But the actual conflict is just in the test and I'm not sure it's
really worth having a test at all. It's testing a pretty low level
detail. So I'm leaning toward fixing the conflict by just ripping the
test out.

Nathan also pointed out there was a simpler way to get the pid. I
don't think the way I was doing it was wrong but I'll double check
that.

#10Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#9)
Re: Temporary tables versus wraparound... again

Here's a rebased patch. I split the test into a separate patch that I
would lean to dropping. But at least it applies now.

I did look into pg_stat_get_backend_pid() and I guess it would work
but going through the stats mechanism does seem like going the long
way around since we're already looking at the backendId info directly
here, we just weren't grabbing the pid.

I did make a small change, I renamed the checkTempNamespaceStatus()
function to checkTempNamespaceStatusAndPid(). It seems unlikely there
are any external consumers of this function (the only internal
consumer is autovacuum.c). But just in case I renamed it to protect
against any external modules failing from the added parameter.

Attachments:

v3-0002-Add-test-for-truncating-temp-tables-advancing-rel.patchtext/x-patch; charset=US-ASCII; name=v3-0002-Add-test-for-truncating-temp-tables-advancing-rel.patchDownload+47-4
v3-0001-Update-relfrozenxmin-when-truncating-temp-tables.patchtext/x-patch; charset=US-ASCII; name=v3-0001-Update-relfrozenxmin-when-truncating-temp-tables.patchDownload+106-21
#11Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#10)
Re: Temporary tables versus wraparound... again

I had to rebase this again after Tom's cleanup of heap.c removing some includes.

I had to re-add snapmgr to access RecentXmin. I occurs to me to ask
whether RecentXmin is actually guaranteed to be set. I haven't
checked. I thought it was set when the first snapshot was taken and
presumably even if it's a non-transactional truncate we're still in a
transaction?

The patch also added heapam.h to heap.c which might seem like a layer
violation. I think it's ok since it's just to be able to update the
catalog (heap_inplace_update is in heapam.h).

Attachments:

v4-0002-Add-test-for-truncating-temp-tables-advancing-rel.patchtext/x-patch; charset=US-ASCII; name=v4-0002-Add-test-for-truncating-temp-tables-advancing-rel.patchDownload+47-4
v4-0001-Update-relfrozenxmin-when-truncating-temp-tables.patchtext/x-patch; charset=US-ASCII; name=v4-0001-Update-relfrozenxmin-when-truncating-temp-tables.patchDownload+107-21
#12Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#11)
Re: Temporary tables versus wraparound... again

Hi,

On 2022-03-28 16:11:55 -0400, Greg Stark wrote:

From 4515075b644d1e38920eb5bdaaa898e1698510a8 Mon Sep 17 00:00:00 2001
From: Greg Stark <stark@mit.edu>
Date: Tue, 22 Mar 2022 15:51:32 -0400
Subject: [PATCH v4 1/2] Update relfrozenxmin when truncating temp tables

Make ON COMMIT DELETE ROWS reset relfrozenxmin and other table stats
like normal truncate. Otherwise even typical short-lived transactions
using temporary tables can easily cause them to reach relfrozenxid.

Might be worth mentioning that ON COMMIT DELETE is implemented as truncating
tables. If we actually implemented it as deleting rows, it'd not at all be
correct to reset relfrozenxmin.

Also add warnings when old temporary tables are found to still be in
use during autovacuum. Long lived sessions using temporary tables are
required to vacuum them themselves.

I'd do that in a separate patch.

+/*
+ * Reset the relfrozenxid and other stats to the same values used when
+ * creating tables. This is used after non-transactional truncation.
+ *
+ * This reduces the need for long-running programs to vacuum their own
+ * temporary tables (since they're not covered by autovacuum) at least in the
+ * case where they're ON COMMIT DELETE ROWS.
+ *
+ * see also src/backend/commands/vacuum.c vac_update_relstats()
+ * also see AddNewRelationTuple() above
+ */
+
+static void
+ResetVacStats(Relation rel)
+{
+	HeapTuple	ctup;
+	Form_pg_class pgcform;
+	Relation classRel;
+
+	/* Fetch a copy of the tuple to scribble on */
+	classRel = table_open(RelationRelationId, RowExclusiveLock);
+	ctup = SearchSysCacheCopy1(RELOID, ObjectIdGetDatum(RelationGetRelid(rel)));
+	if (!HeapTupleIsValid(ctup))
+		elog(ERROR, "pg_class entry for relid %u vanished during truncation",
+			 RelationGetRelid(rel));
+	pgcform = (Form_pg_class) GETSTRUCT(ctup);
+
+	/*
+	 * Update relfrozenxid
+	 */
+
+	pgcform->relpages = 0;
+	pgcform->reltuples = -1;
+	pgcform->relallvisible = 0;
+	pgcform->relfrozenxid = RecentXmin;

Hm. Is RecentXmin guaranteed to be valid at this point?

+ pgcform->relminmxid = GetOldestMultiXactId();

Ugh. That's pretty expensive for something now done at a much higher rate than
before.

@@ -2113,20 +2126,31 @@ do_autovacuum(void)
* Remember it so we can try to delete it later.
*/
orphan_oids = lappend_oid(orphan_oids, relid);
+			} else if (temp_status == TEMP_NAMESPACE_NOT_TEMP) {
+				elog(LOG, "autovacuum: found temporary table \"%s.%s.%s\" in non-temporary namespace",
+					 get_database_name(MyDatabaseId),
+					 get_namespace_name(classForm->relnamespace),
+					 NameStr(classForm->relname));
+			} else if (temp_status == TEMP_NAMESPACE_IN_USE && wraparound) {

we put else if on a separate line from }. And { also is always on a separate
line.

Greetings,

Andres Freund

#13Bruce Momjian
bruce@momjian.us
In reply to: Andres Freund (#12)
Re: Temporary tables versus wraparound... again

On Mon, 28 Mar 2022 at 16:30, Andres Freund <andres@anarazel.de> wrote:

Make ON COMMIT DELETE ROWS reset relfrozenxmin and other table stats
like normal truncate. Otherwise even typical short-lived transactions
using temporary tables can easily cause them to reach relfrozenxid.

Might be worth mentioning that ON COMMIT DELETE is implemented as truncating
tables. If we actually implemented it as deleting rows, it'd not at all be
correct to reset relfrozenxmin.

In the commit message or are you saying this needs documentation or a comment?

Also add warnings when old temporary tables are found to still be in
use during autovacuum. Long lived sessions using temporary tables are
required to vacuum them themselves.

I'd do that in a separate patch.

Hm, seems a bit small but sure no problem, I'll split it out.

+ pgcform->relfrozenxid = RecentXmin;

Hm. Is RecentXmin guaranteed to be valid at this point?

I mentioned the same worry. But ok, I just looked into it and it's
definitely not a problem. We only do truncates after either a user
issued TRUNCATE when the table was created in the same transaction or
at commit iff a flag is set indicating temporary tables have been
used. Either way a snapshot has been taken. I've added some comments
and an assertion and I think if assertions are disabled and this
impossible condition is hit we can just skip the stats reset.

+ pgcform->relminmxid = GetOldestMultiXactId();

Ugh. That's pretty expensive for something now done at a much higher rate than
before.

This I'm really not sure about. I really don't know much about
multixacts. I've been reading a bit but I'm not sure what to do yet.
I'm wondering if there's something cheaper we can use. We don't need
the oldest mxid that might be visible in a table somewhere, just the
oldest that has a real live uncommitted transaction in it that could
yet create new tuples in the truncated table.

In the case of temporary tables I think we could just set it to the
next mxid since there are no live transactions capable of inserting
into the temporary table. But in the case of a table created in this
transaction then that wouldn't be good enough. I think? I'm not clear
whether existing mxids get reused for new updates if they happen to
have the same set of locks in them as some existing mxid.

we put else if on a separate line from }. And { also is always on a separate
line.

Sorry, old habits...

Incidentally.... in doing the above I noticed an actual bug :( The
toast reset had the wrong relid in it. I'll add the toast table to the
test too.

--
greg

#14David G. Johnston
david.g.johnston@gmail.com
In reply to: Bruce Momjian (#13)
Re: Temporary tables versus wraparound... again

On Tue, Mar 29, 2022 at 4:52 PM Greg Stark <stark@mit.edu> wrote:

On Mon, 28 Mar 2022 at 16:30, Andres Freund <andres@anarazel.de> wrote:

Make ON COMMIT DELETE ROWS reset relfrozenxmin and other table

stats

like normal truncate. Otherwise even typical short-lived

transactions

using temporary tables can easily cause them to reach relfrozenxid.

Might be worth mentioning that ON COMMIT DELETE is implemented as

truncating

tables. If we actually implemented it as deleting rows, it'd not at all

be

correct to reset relfrozenxmin.

In the commit message or are you saying this needs documentation or a
comment?

Just flying by here but...

The user-facing documentation already covers this:

https://www.postgresql.org/docs/current/sql-createtable.html

"All rows in the temporary table will be deleted at the end of each
transaction block. Essentially, an automatic TRUNCATE is done at each
commit. When used on a partitioned table, this is not cascaded to its
partitions."

I'm not sure why we felt the need to add "essentially" here - but maybe
it's because we didn't "reset relfronzedenxmin and other table stats like
normal truncate."? Or maybe just natural word flow.

Either way, maybe word it like this to avoid the need for essentially
altogether:

The temporary table will be automatically truncated at the end of each
transaction block. However, unlike the TRUNCATE command, descendent tables
will not be cascaded to. (I'm changing partitions to descendant tables to
make a point here - the TRUNCATE command only references descendent tables,
not mentioning partitioning by name at all. Is this desirable?)

I don't have any substantive insight into the commit message or code
comments; but it doesn't seem immediately wrong to assume the reader
understands that ON COMMIT DELETE ROWS uses something more akin to TRUNCATE
rather than DELETE since that is what the feature is documented to do. The
commit message in particular seems like it doesn't need to teach that
point; but can do so if it makes understanding the changes easier.

David J.

#15Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#13)
Re: Temporary tables versus wraparound... again

Hi,

On 2022-03-29 19:51:26 -0400, Greg Stark wrote:

On Mon, 28 Mar 2022 at 16:30, Andres Freund <andres@anarazel.de> wrote:

Make ON COMMIT DELETE ROWS reset relfrozenxmin and other table stats
like normal truncate. Otherwise even typical short-lived transactions
using temporary tables can easily cause them to reach relfrozenxid.

Might be worth mentioning that ON COMMIT DELETE is implemented as truncating
tables. If we actually implemented it as deleting rows, it'd not at all be
correct to reset relfrozenxmin.

In the commit message or are you saying this needs documentation or a comment?

In the commit message.

+ pgcform->relminmxid = GetOldestMultiXactId();

Ugh. That's pretty expensive for something now done at a much higher rate than
before.

This I'm really not sure about. I really don't know much about
multixacts. I've been reading a bit but I'm not sure what to do yet.
I'm wondering if there's something cheaper we can use. We don't need
the oldest mxid that might be visible in a table somewhere, just the
oldest that has a real live uncommitted transaction in it that could
yet create new tuples in the truncated table.

In the case of temporary tables I think we could just set it to the
next mxid since there are no live transactions capable of inserting
into the temporary table. But in the case of a table created in this
transaction then that wouldn't be good enough. I think? I'm not clear
whether existing mxids get reused for new updates if they happen to
have the same set of locks in them as some existing mxid.

Yes, that can happen. But of course the current xid is always part of the
multixact, so it can't be a multixact from before the transaction started.

There's already a record of the oldest mxid a backend considers live, computed
on the first use of multixacts in a transaction. See
MultiXactIdSetOldestVisible(). Which I think might serve as a suitable
relminmxid of a temporary table in an already running transaction?

Greetings,

Andres Freund

#16Bruce Momjian
bruce@momjian.us
In reply to: Andres Freund (#15)
Re: Temporary tables versus wraparound... again

I've updated the patches.

Adding the assertion actually turned up a corner case where RecentXmin
was *not* set. If you lock a temporary table and that's the only thing
you do in a transaction then the flag is set indicating you've used
the temp schema but you never take a snapshot :(

I also split out the warnings and added a test that relfrozenxid was
advanced on the toast table as well.

I haven't wrapped my head around multixacts yet. It's complicated by
this same codepath being used for truncates of regular tables that
were created in the same transaction.

Attachments:

v5-0002-Update-relfrozenxmin-when-truncating-temp-tables.patchtext/x-patch; charset=US-ASCII; name=v5-0002-Update-relfrozenxmin-when-truncating-temp-tables.patchDownload+60-1
v5-0003-Add-test-for-truncating-temp-tables-advancing-rel.patchtext/x-patch; charset=US-ASCII; name=v5-0003-Add-test-for-truncating-temp-tables-advancing-rel.patchDownload+74-4
v5-0001-Add-warnings-when-old-temporary-tables-are-found-.patchtext/x-patch; charset=US-ASCII; name=v5-0001-Add-warnings-when-old-temporary-tables-are-found-.patchDownload+57-21
#17Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#16)
Re: Temporary tables versus wraparound... again

On Thu, 31 Mar 2022 at 16:05, Greg Stark <stark@mit.edu> wrote:

I haven't wrapped my head around multixacts yet. It's complicated by
this same codepath being used for truncates of regular tables that
were created in the same transaction.

So my best idea so far is to actually special-case the temp table case
in this code path. I think that's easy enough since I have the heap
tuple I'm about to replace.

In the temp table case I would just use the value Andres proposes.

In the "truncating table in same transaction it was created" case then
I would go ahead and use the expensive GetOldestMultiXactId() which
should be ok for that case. At least I think the "much higher rate"
comment was motivated by the idea that every transaction commit (when
temp tables are being used) is more frequent than any specific user
ddl.

It's not brilliant since it seems to be embedding knowledge of the
cases where this optimization applies in a lower level function. If we
think of some other case where it could apply it wouldn't be obvious
that it will have a cost to it. But it doesn't seem too terrible to
me.

An alternative would be to simply not adjust relminmxid for non-temp
tables at all. I guess that's not too bad either since these are
non-temp tables that autovacuum will be able to do anti-wraparound
vacuums on. And I get the impression mxids don't wraparound nearly as
often as xids?

--
greg

#18Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#17)
Re: Temporary tables versus wraparound... again

So here's an updated patch.

I had to add a public method to multixact.c to expose the locally
calculated OldestMultiXactId. It's possible we could use something
even tighter (like the current next mxid since we're about to commit)
but I didn't see a point in going further and it would have become
more complex.

I also added a branch in heapam_handler.c in ..._set_new_filenode() of
temporary tables. It feels like a free win and it's more consistent.

I'm not 100% on the tableam abstraction -- it's possible all of this
change should have happened in heapam_handler somewhere? I don't think
so but it does feel weird to be touching it and also doing the same
thing elsewhere.

I think this has addressed all the questions now.

Attachments:

v6-0001-Add-warnings-when-old-temporary-tables-are-found-.patchtext/x-patch; charset=US-ASCII; name=v6-0001-Add-warnings-when-old-temporary-tables-are-found-.patchDownload+57-21
v6-0003-Add-test-for-truncating-temp-tables-advancing-rel.patchtext/x-patch; charset=US-ASCII; name=v6-0003-Add-test-for-truncating-temp-tables-advancing-rel.patchDownload+74-4
v6-0002-Update-relfrozenxmin-when-truncating-temp-tables.patchtext/x-patch; charset=US-ASCII; name=v6-0002-Update-relfrozenxmin-when-truncating-temp-tables.patchDownload+99-3
#19Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#18)
Re: Temporary tables versus wraparound... again

Simple Rebase

Attachments:

v7-0003-Add-test-for-truncating-temp-tables-advancing-rel.patchtext/x-patch; charset=US-ASCII; name=v7-0003-Add-test-for-truncating-temp-tables-advancing-rel.patchDownload+74-4
v7-0001-Add-warnings-when-old-temporary-tables-are-found-.patchtext/x-patch; charset=US-ASCII; name=v7-0001-Add-warnings-when-old-temporary-tables-are-found-.patchDownload+57-21
v7-0002-Update-relfrozenxmin-when-truncating-temp-tables.patchtext/x-patch; charset=US-ASCII; name=v7-0002-Update-relfrozenxmin-when-truncating-temp-tables.patchDownload+99-3
#20Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#1)
Re: Temporary tables versus wraparound... again

On Sun, 8 Nov 2020 at 18:19, Greg Stark <stark@mit.edu> wrote:

We had an outage caused by transaction wraparound. And yes, one of the
first things I did on this site was check that we didn't have any
databases that were in danger of wraparound.

Fwiw this patch has been in "Ready for Committer" state since April
and has been moved forward three times including missing the release.
It's a pretty short patch and fixes a problem that caused an outage
for $previous_employer and I've had private discussions from other
people who have been struggling with the same issue. Personally I
consider it pretty close to a bug fix and worth backpatching. I think
it's pretty annoying to have put out a release without this fix.

--
greg

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#19)
#22Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#21)
#23Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#22)
#24Bruce Momjian
bruce@momjian.us
In reply to: Andres Freund (#23)
#25Bruce Momjian
bruce@momjian.us
In reply to: Andres Freund (#23)
#26Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#25)
#27Bruce Momjian
bruce@momjian.us
In reply to: Andres Freund (#26)
#28Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#27)
#29Bruce Momjian
bruce@momjian.us
In reply to: Andres Freund (#23)
#30Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#29)
#31Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#21)
#32Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#31)
#33Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#32)
#34Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#33)
#35Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Bruce Momjian (#30)
#36Bruce Momjian
bruce@momjian.us
In reply to: Alvaro Herrera (#35)
#37Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#36)
#38Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#37)
#39Justin Pryzby
pryzby@telsasoft.com
In reply to: Bruce Momjian (#36)
#40Bruce Momjian
bruce@momjian.us
In reply to: Justin Pryzby (#39)
#41Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#40)
#42Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#41)
#43Bruce Momjian
bruce@momjian.us
In reply to: Andres Freund (#42)
#44Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#43)
#45Bruce Momjian
bruce@momjian.us
In reply to: Andres Freund (#44)
#46Bruce Momjian
bruce@momjian.us
In reply to: Andres Freund (#44)
#47Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#46)
In reply to: Robert Haas (#47)
#49Bruce Momjian
bruce@momjian.us
In reply to: Peter Geoghegan (#48)
In reply to: Bruce Momjian (#49)
#51Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#49)
In reply to: Andres Freund (#51)
#53Bruce Momjian
bruce@momjian.us
In reply to: Andres Freund (#44)
#54Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#53)
#55Peter Smith
smithpb2250@gmail.com
In reply to: Bruce Momjian (#54)