Performance issue on temporary relations

Started by 章晨曦5 months ago10 messages
#1章晨曦
zhangchenxi@halodbtech.com
1 attachment(s)

Hi there!

Recently I noticed a performance issue on temporary relation. The issue will happened on
ON COMMIT DELETE temporary relations. If one session only create a few temporary relations,
well, it's fine. But if one session creates plenty of ON COMMIT DELETE kind temporary relations,
say 3,000, it will face a significant performance degradation issue. Check below:

One temporary relation
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

psql (19devel)
Type "help" for help.

postgres=# CREATE LOCAL TEMP TABLE a_lttk1(n INT) ON COMMIT DELETE ROWS;
CREATE TABLE
postgres=# \timing
Timing is on.
postgres=# SELECT count(*) FROM a_lttk1;
count
-------
0
(1 row)

Time: 3.004 ms
postgres=# SELECT count(*) FROM a_lttk1;
count
-------
0
(1 row)

Time: 3.884 ms
postgres=# SELECT count(*) FROM a_lttk1;
count
-------
0
(1 row)

Time: 4.041 ms
postgres=# SELECT count(*) FROM a_lttk1;
count
-------
0
(1 row)

Time: 3.827 ms
postgres=# SELECT count(*) FROM a_lttk1;
count
-------
0
(1 row)

Time: 3.783 ms

3,000 temporary relation
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

psql (19devel)
Type "help" for help.

postgres=# DO $$
postgres$# DECLARE
postgres$# v_sql VARCHAR(100);
postgres$# BEGIN
postgres$# FOR i IN 1..3000 LOOP
postgres$# v_sql := 'CREATE LOCAL TEMP TABLE a_lttk'||i||'(n INT) ON COMMIT DELETE ROWS';
postgres$# EXECUTE v_sql;
postgres$# END LOOP;
postgres$# END;
postgres$# $$ LANGUAGE plpgsql;
DO
postgres=# \timing
Timing is on.
postgres=# SELECT count(*) FROM a_lttk1;
count
-------
0
(1 row)

Time: 45.471 ms
postgres=# SELECT count(*) FROM a_lttk1;
count
-------
0
(1 row)

Time: 27.320 ms
postgres=# SELECT count(*) FROM a_lttk1;
count
-------
0
(1 row)

Time: 27.482 ms
postgres=# SELECT count(*) FROM a_lttk1;
count
-------
0
(1 row)

Time: 26.907 ms
postgres=# SELECT count(*) FROM a_lttk1;
count
-------
0
(1 row)

Time: 31.055 ms
postgres=# SELECT count(*) FROM a_lttk1;
count
-------
0
(1 row)

Time: 28.624 ms
postgres=# SELECT count(*) FROM a_lttk1;
count
-------
0
(1 row)

Time: 25.277 ms

The performance has decreased by nearly 10 times. The reason is we just check if there is
operation on any ON COMMIT DELETE kind temporary relations. Regardless of how many temporary
tables are actually accessed, even if only one is accessed, it will do the truncate on all
the temporary tables.

To overcome this issue, A new list named in_use has been introduced to record the actually
accessed temporary relations, and then will do the truncate only on the actually accessed
temporary relations. And it seems works well.

After patch:
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

psql (19devel)
Type "help" for help.

postgres=# DO $$
postgres$# DECLARE
postgres$# v_sql VARCHAR(100);
postgres$# BEGIN
postgres$# FOR i IN 1..3000 LOOP
postgres$# v_sql := 'CREATE LOCAL TEMP TABLE a_lttk'||i||'(n INT) ON COMMIT DELETE ROWS';
postgres$# EXECUTE v_sql;
postgres$# END LOOP;
postgres$# END;
postgres$# $$ LANGUAGE plpgsql;
DO
postgres=# \timing
Timing is on.
postgres=# SELECT count(*) FROM a_lttk1;
count
-------
0
(1 row)

Time: 5.253 ms
postgres=# SELECT count(*) FROM a_lttk1;
count
-------
0
(1 row)

Time: 5.512 ms
postgres=# SELECT count(*) FROM a_lttk1;
count
-------
0
(1 row)

Time: 5.095 ms
postgres=# SELECT count(*) FROM a_lttk1;
count
-------
0
(1 row)

Time: 5.119 ms
postgres=# SELECT count(*) FROM a_lttk1;
count
-------
0
(1 row)

Time: 5.057 ms
postgres=# SELECT count(*) FROM a_lttk1;
count
-------
0
(1 row)

Time: 5.006 ms

Regards,
Jet

Halo Tech (www.halodbtech.com)
openHalo (www.openhalo.org)

Attachments:

improve_temporary_rel_performance_v01.patchapplication/octet-stream; charset=utf-8; name=improve_temporary_rel_performance_v01.patchDownload
diff --git a/src/backend/access/common/relation.c b/src/backend/access/common/relation.c
index 22c4cd5a256..60a9106d6fa 100644
--- a/src/backend/access/common/relation.c
+++ b/src/backend/access/common/relation.c
@@ -23,6 +23,7 @@
 #include "access/relation.h"
 #include "access/xact.h"
 #include "catalog/namespace.h"
+#include "commands/tablecmds.h"
 #include "pgstat.h"
 #include "storage/lmgr.h"
 #include "utils/inval.h"
@@ -70,8 +71,13 @@ relation_open(Oid relationId, LOCKMODE lockmode)
 
    /* Make note that we've accessed a temporary relation */
    if (RelationUsesLocalBuffers(r))
+   {
        MyXactFlags |= XACT_FLAGS_ACCESSEDTEMPNAMESPACE;
 
+       /* ... and mark the temporary relation accessed */
+       MarkTemporaryRelAccessed(RelationGetRelid(r));
+   }
+
    pgstat_init_relation(r);
 
    return r;
@@ -120,8 +126,13 @@ try_relation_open(Oid relationId, LOCKMODE lockmode)
 
    /* Make note that we've accessed a temporary relation */
    if (RelationUsesLocalBuffers(r))
+   {
        MyXactFlags |= XACT_FLAGS_ACCESSEDTEMPNAMESPACE;
 
+       /* ... and mark the temporary relation accessed */
+       MarkTemporaryRelAccessed(RelationGetRelid(r));
+   }
+
    pgstat_init_relation(r);
 
    return r;
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index c6dd2e020da..0bfe9efa40c 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -129,6 +129,7 @@ typedef struct OnCommitItem
 } OnCommitItem;
 
 static List *on_commits = NIL;
+static List *in_use = NIL; /* record accessed temporary relations */
 
 
 /*
@@ -739,6 +740,7 @@ static List *GetParentedForeignKeyRefs(Relation partition);
 static void ATDetachCheckNoForeignKeyRefs(Relation partition);
 static char GetAttributeCompression(Oid atttypid, const char *compression);
 static char GetAttributeStorage(Oid atttypid, const char *storagemode);
+static bool CheckTemporaryRelAccessed(Oid relid);
 
 
 /* ----------------------------------------------------------------
@@ -19302,11 +19304,11 @@ PreCommit_on_commit_actions(void)
            case ONCOMMIT_DELETE_ROWS:
 
                /*
-                * If this transaction hasn't accessed any temporary
+                * If this transaction hasn't accessed this temporary
                 * relations, we can skip truncating ON COMMIT DELETE ROWS
                 * tables, as they must still be empty.
                 */
-               if ((MyXactFlags & XACT_FLAGS_ACCESSEDTEMPNAMESPACE))
+               if (CheckTemporaryRelAccessed(oc->relid))
                    oids_to_truncate = lappend_oid(oids_to_truncate, oc->relid);
                break;
            case ONCOMMIT_DROP:
@@ -19375,6 +19377,10 @@ PreCommit_on_commit_actions(void)
        }
 #endif
    }
+
+   /* free accessed list */
+   list_free(in_use);
+   in_use = NIL;
 }
 
 /*
@@ -19485,6 +19491,21 @@ RangeVarCallbackMaintainsTable(const RangeVar *relation,
                       relation->relname);
 }
 
+/*
+ * Mark one temporary relation accessed
+ */
+void
+MarkTemporaryRelAccessed(Oid relid)
+{
+   MemoryContext oldctx;
+
+   oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+
+   in_use = list_append_unique_oid(in_use, relid);
+
+   MemoryContextSwitchTo(oldctx);
+}
+
 /*
  * Callback to RangeVarGetRelidExtended() for TRUNCATE processing.
  */
@@ -22057,3 +22078,27 @@ GetAttributeStorage(Oid atttypid, const char *storagemode)
 
    return cstorage;
 }
+
+
+/*
+ * Check temporary relation accessed
+ */
+static bool
+CheckTemporaryRelAccessed(Oid relid)
+{
+   ListCell    *lc;
+
+   /* quick check if temporary relation accessed in this transaction */
+   if (!(MyXactFlags & XACT_FLAGS_ACCESSEDTEMPNAMESPACE))
+       return false;
+
+   foreach(lc, in_use)
+   {
+       Oid tid = lfirst_oid(lc);
+
+       if (tid == relid)
+           return true;
+   }
+
+   return false;
+}
\ No newline at end of file
diff --git a/src/include/commands/tablecmds.h b/src/include/commands/tablecmds.h
index 6832470d387..a7a27945864 100644
--- a/src/include/commands/tablecmds.h
+++ b/src/include/commands/tablecmds.h
@@ -106,5 +106,6 @@ extern void RangeVarCallbackOwnsRelation(const RangeVar *relation,
                                         Oid relId, Oid oldRelId, void *arg);
 extern bool PartConstraintImpliedByRelConstraint(Relation scanrel,
                                                 List *partConstraint);
+extern void MarkTemporaryRelAccessed(Oid relid);
 
 #endif                         /* TABLECMDS_H */
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: 章晨曦 (#1)
Re: Performance issue on temporary relations

"=?utf-8?B?56ug5pmo5pum?=" <zhangchenxi@halodbtech.com> writes:

Recently I noticed a performance issue on temporary relation. The issue will happened on
ON COMMIT DELETE temporary relations. If one session only create a few temporary relations,
well, it's fine. But if one session creates plenty of ON COMMIT DELETE kind temporary relations,
say 3,000, it will face a significant performance degradation issue.

Do you think that's supposed to be free?

To overcome this issue, A new list named in_use has been introduced to record the actually
accessed temporary relations, and then will do the truncate only on the actually accessed
temporary relations. And it seems works well.

I do not think this is something we ought to consider. It might help
certain corner use-cases, but it's probably a net loss for most.
In particular, I don't think that creating thousands of temp tables in
a session but then touching only a few of them in any one transaction
is a very plausible usage pattern.

regards, tom lane

#3章晨曦
zhangchenxi@halodbtech.com
In reply to: Tom Lane (#2)
Re: Performance issue on temporary relations

I do not think this is something we ought to consider. It might help
certain corner use-cases, but it's probably a net loss for most.
In particular, I don't think that creating thousands of temp tables in
a session but then touching only a few of them in any one transaction
is a very plausible usage pattern.

Acturely, we just facing such problem in some real systems. More than 3,700
temporary tables created! I accept such case is not that common, but it does exist.

Regards,
Jet

Halo Tech (www.halodbtech.com)
openHalo (www.openhalo.org)

#4David G. Johnston
david.g.johnston@gmail.com
In reply to: 章晨曦 (#3)
Re: Performance issue on temporary relations

On Tue, Aug 19, 2025 at 8:45 AM 章晨曦 <zhangchenxi@halodbtech.com> wrote:

I do not think this is something we ought to consider. It might help
certain corner use-cases, but it's probably a net loss for most.
In particular, I don't think that creating thousands of temp tables in
a session but then touching only a few of them in any one transaction
is a very plausible usage pattern.

Acturely, we just facing such problem in some real systems. More than 3,700
temporary tables created! I accept such case is not that common, but it
does exist.

It is unfair to add a performance penalty to everyone just because some
people write bad code. I concur that adding complexity to the system to
gracefully handle this corner-case doesn't seem justified. A use case
description, not mere existence, is needed to provide such justification.

David J.

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: David G. Johnston (#4)
Re: Performance issue on temporary relations

"David G. Johnston" <david.g.johnston@gmail.com> writes:

On Tue, Aug 19, 2025 at 8:45 AM 章晨曦 <zhangchenxi@halodbtech.com> wrote:

Acturely, we just facing such problem in some real systems. More than 3,700
temporary tables created! I accept such case is not that common, but it
does exist.

It is unfair to add a performance penalty to everyone just because some
people write bad code. I concur that adding complexity to the system to
gracefully handle this corner-case doesn't seem justified. A use case
description, not mere existence, is needed to provide such justification.

Yeah, the real sub-text here is "should we believe that this
application is well designed?". It sounds like a fairly brute-force
solution.

regards, tom lane

#6章晨曦
zhangchenxi@halodbtech.com
In reply to: Tom Lane (#5)
Re: Performance issue on temporary relations

It is unfair to add a performance penalty to everyone just because some
people write bad code. I concur that adding complexity to the system to
gracefully handle this corner-case doesn't seem justified. A use case
description, not mere existence, is needed to provide such justification.

Yeah, the real sub-text here is "should we believe that this
application is well designed?". It sounds like a fairly brute-force
solution.

I agree the application is not well designed for PostgreSQL because it was migrated
from Oracle, and may not do such optimization. But back to this issue, even though
we only create 10 temporary relations, it will cause 10 truncates on every transaction.
Is that a good design?

Regards,
Jet

Halo Tech (www.halodbtech.com)
openHalo (www.openhalo.org)

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: 章晨曦 (#6)
Re: Performance issue on temporary relations

"=?utf-8?B?56ug5pmo5pum?=" <zhangchenxi@halodbtech.com> writes:

I agree the application is not well designed for PostgreSQL because it was migrated
from Oracle, and may not do such optimization. But back to this issue, even though
we only create 10 temporary relations, it will cause 10 truncates on every transaction.
Is that a good design?

[ shrug... ] If you create an ON COMMIT DELETE temp table, you
are explicitly asking for a truncation to happen at every commit.
I don't think you have much room to beef about the fact that one
happens.

Maybe you could merge some of these tables (adding an additional
key column, probably) so that a single truncate suffices for all?

regards, tom lane

#8章晨曦
zhangchenxi@halodbtech.com
In reply to: Tom Lane (#7)
Re: Performance issue on temporary relations

[ shrug... ] If you create an ON COMMIT DELETE temp table, you
are explicitly asking for a truncation to happen at every commit.
I don't think you have much room to beef about the fact that one
happens.

Yes. ON COMMIT DELETE temp table will be truncated at every commit.
But if we can control that only accessed temp tables will be truncated
may be better. When an temp tables accessed, it will be stored in in_use
list, and when on commit, it will check the in_use list if a truncation needed
to happen on that temp table, and reduce unnecessary truncations.

Regards,
Jet

Halo Tech (www.halodbtech.com)
openHalo (www.openhalo.org)

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: 章晨曦 (#8)
Re: Performance issue on temporary relations

"=?utf-8?B?56ug5pmo5pum?=" <zhangchenxi@halodbtech.com> writes:

Yes. ON COMMIT DELETE temp table will be truncated at every commit.
But if we can control that only accessed temp tables will be truncated
may be better. When an temp tables accessed, it will be stored in in_use
list, and when on commit, it will check the in_use list if a truncation needed
to happen on that temp table, and reduce unnecessary truncations.

The problem with this proposal is that you are ignoring the cost
of maintaining that list. That's going to slow down every operation
on temp tables, and in common scenarios (where a truncate would have
to happen anyway) applications will get zero benefit for the extra
overhead. So I'm not excited about adding complexity and long-term
maintenance burden to do this.

BTW, it appears to me that doing it this way is O(N^2) in the number
of active temp tables. So it's not hard to believe that the patch
as-presented would actually be a fairly serious performance drag for
some use cases with lots of temp tables. There are certainly ways
we could do better than that (hash table, bloom filter, etc) but
there would be even more engineering effort needed.

And it's also fair to wonder if you've found all the places where we'd
need to mark temp tables dirty, and what we'd need to do to be sure
we didn't introduce any oops-forgot-to-mark-it-dirty bugs in future.
If you search the commit log for mentions of bugs associated with
ON COMMIT DELETE, you'll find quite a few, which is one reason why
I'm allergic to adding more complexity here.

In short, I think you're underestimating the engineering costs and
overestimating the benefit of doing this.

regards, tom lane

#10章晨曦
zhangchenxi@halodbtech.com
In reply to: Tom Lane (#9)
Re: Performance issue on temporary relations

BTW, it appears to me that doing it this way is O(N^2) in the number
of active temp tables. So it's not hard to believe that the patch
as-presented would actually be a fairly serious performance drag for
some use cases with lots of temp tables. There are certainly ways
we could do better than that (hash table, bloom filter, etc) but
there would be even more engineering effort needed.

Yes, you're right. I also consider using like hash table to do more better and try
to merge the in_use list and on_commits list into one hashtable. But, as just you
said, it needs much more effort. Thanks any way.

Regards,
Jet

Halo Tech (www.halodbtech.com)
openHalo (www.openhalo.org)