BUG #19520: PANIC when concurrently manipulating stored procedures with pg_stat_statements and track_functions =

Started by PG Bug reporting form | 9 days ago | 17 messages | bugs
#1PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 19520
Logged by: zhanglihui
Email address: zlh21343@163.com
PostgreSQL version: 19beta1
Operating system: Ubuntu 25.04
Description:

=== Configuration ===
Enabled extensions: pg_stat_statements
postgresql.conf:
shared_preload_libraries = 'pg_stat_statements'
track_functions = 'all'

=== Problem Description ===
The PostgreSQL server PANICs under highly concurrent CREATE, CALL and DROP
of stored procedures.
This issue **only reproduces when pg_stat_statements is enabled and
track_functions = all**.
It cannot be triggered if pg_stat_statements is disabled or track_functions
is set to none/pl.

=== Run Steps ===
javac -cp postgresql-42.7.5.jar -d out src/ConcurrentSqlTest.java \
    src/ProcCrashReprodure.java
# Terminal 1 — Mixed DDL/DML load test (40 threads, 10000 iterations each)
java -cp out:postgresql-42.7.5.jar ConcurrentSqlTest

# Terminal 2 — Pure CALL load test (20 threads, infinite loop)
java -cp out:postgresql-42.7.5.jar ProcCrashReprodure
# Note: If no PANIC log or core dump is generated after the execution of
# Terminal 1, please re-run the command repeatedly until the issue occurs.

PANIC log:
postgresql-2026-06-14_235441.log:2026-06-14 23:54:41.949 CST [691761] PANIC: XX000: cannot abort transaction 4166281, it was already committed
postgresql-2026-06-14_235931.log:2026-06-14 23:59:31.556 CST [696980] PANIC: XX000: cannot abort transaction 4641943, it was already committed

(gdb) bt
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=6,
no_tid=0) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (threadid=<optimized out>, signo=6) at
./nptl/pthread_kill.c:89
#2 __GI___pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at
./nptl/pthread_kill.c:100
#3 0x000073e62ec4579e in __GI_raise (sig=sig@entry=6) at
../sysdeps/posix/raise.c:26
#4 0x000073e62ec288cd in __GI_abort () at ./stdlib/abort.c:73
#5 0x000056aeb574138c in errfinish (filename=0x56aeb57ff545 "xact.c",
lineno=1835, funcname=0x56aeb5800d00 <__func__.26> "RecordTransactionAbort")
at elog.c:621
#6 0x000056aeb4faae67 in RecordTransactionAbort (isSubXact=false) at
xact.c:1835
#7 0x000056aeb4fac19f in AbortTransaction () at xact.c:2982
#8 0x000056aeb4facc22 in AbortCurrentTransactionInternal () at xact.c:3553
#9 0x000056aeb4facb93 in AbortCurrentTransaction () at xact.c:3507
#10 0x000056aeb5525552 in PostgresMain (dbname=0x56aedecf30d0 "postgres",
username=0x56aedecf30b0 "zlh_user") at postgres.c:4539
#11 0x000056aeb551b59c in BackendMain (startup_data=0x7fffb9501000,
startup_data_len=24) at backend_startup.c:124
#12 0x000056aeb5405686 in postmaster_child_launch (child_type=B_BACKEND,
child_slot=24, startup_data=0x7fffb9501000, startup_data_len=24,
client_sock=0x7fffb9501060) at launch_backend.c:268
#13 0x000056aeb540c11b in BackendStartup (client_sock=0x7fffb9501060) at
postmaster.c:3627
#14 0x000056aeb540969f in ServerLoop () at postmaster.c:1728
#15 0x000056aeb5408f7c in PostmasterMain (argc=1, argv=0x56aedeca1430) at
postmaster.c:1415
#16 0x000056aeb528ce51 in main (argc=1, argv=0x56aedeca1430) at main.c:231

---- File 1: ConcurrentSqlTest.java (Mixed DDL/DML test, 40 threads × 10000 iterations) ----
import java.sql.*;
import java.util.*;
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

/**
* PostgreSQL concurrent DDL/DML stress test — reproduces backend crash
* (signal 5 / SIGTRAP) when DROP/CREATE PROCEDURE and CALL PROCEDURE
* execute concurrently at high concurrency.
*
* Compile: javac -cp postgresql-42.7.5.jar ConcurrentSqlTest.java
* Run: java -cp .:postgresql-42.7.5.jar ConcurrentSqlTest
*
* Optional env vars: PG_HOST, PG_PORT, PG_DATABASE, PG_USER, PG_PASSWORD
*/
public final class ConcurrentSqlTest {

// ── connection defaults ──
private static final String HOST = envOr("PG_HOST",
"192.168.239.128");
private static final String PORT = envOr("PG_PORT", "5432");
private static final String DATABASE = envOr("PG_DATABASE", "postgres");
private static final String USER = envOr("PG_USER", "zlh_user");
private static final String PASSWORD = envOr("PG_PASSWORD",
"Gauss@123");
private static final String JDBC_URL =
"jdbc:postgresql://" + HOST + ":" + PORT + "/" + DATABASE;

// ── test parameters ──
private static final int THREADS = 40;
private static final int ITERATIONS = 10_000;

// ── SQL batches ──
private static final String[] SQL = {
"DROP PROCEDURE IF EXISTS proc_test",
"CREATE OR REPLACE PROCEDURE proc_test()\n"
+ "LANGUAGE plpgsql\n"
+ "AS $$\n"
+ "BEGIN\n"
+ "END;\n"
+ "$$",
"CALL proc_test()",
};

// ── counters ──
private static final AtomicInteger totalOk = new AtomicInteger(0);
private static final AtomicInteger totalFail = new AtomicInteger(0);

static {
try { Class.forName("org.postgresql.Driver"); }
catch (ClassNotFoundException e) {
System.err.println("PostgreSQL JDBC driver not found on classpath");
System.exit(1);
}
}

public static void main(String[] args) throws InterruptedException {
System.out.println("=== PostgreSQL concurrent DROP/CREATE/CALL PROCEDURE ===");
System.out.println("URL: " + JDBC_URL);
System.out.println("Threads: " + THREADS + " | Iterations/thread: "
+ ITERATIONS);
System.out.println("Watch for: backend terminated by signal 5 (SIGTRAP)");
System.out.println("========================================================");

long t0 = System.currentTimeMillis();
ExecutorService pool = Executors.newFixedThreadPool(THREADS);
List<Future<?>> futures = new ArrayList<>();

for (int i = 0; i < THREADS; i++) {
final int tid = i;
futures.add(pool.submit(() -> runWorker(tid)));
}

Runtime.getRuntime().addShutdownHook(new Thread(() -> {
System.out.println("\nShutting down...");
pool.shutdownNow();
}));

// drain
for (Future<?> f : futures) {
try { f.get(); } catch (ExecutionException e) {
System.err.println("Thread crashed: " +
e.getCause().getMessage());
} catch (CancellationException ignored) { }
}
pool.shutdown();

long elapsed = System.currentTimeMillis() - t0;
System.out.println("========================================================");
System.out.printf("Done. OK: %d FAIL: %d Time: %.1fs%n",
totalOk.get(), totalFail.get(), elapsed / 1000.0);
System.exit(totalFail.get() > 0 ? 1 : 0);
}

// ── single worker (reuses one connection) ──
private static void runWorker(int tid) {
int ok = 0, fail = 0;
Connection c = openConnection();
for (int i = 0; i < ITERATIONS; i++) {
if (Thread.currentThread().isInterrupted()) break;
try {
for (String sql : SQL) {
try (Statement s = c.createStatement()) {
s.execute(sql);
}
}
ok++;
} catch (SQLException e) {
fail++;
if (totalFail.incrementAndGet() <= 10) {
System.err.printf("Thread %d loop %d: [%s] %s%n",
tid, i, e.getSQLState(),
e.getMessage().replace('\n', ' '));
}
// reconnect if server closed the connection (e.g. after crash)
try {
if (c.isClosed()) c = openConnection();
} catch (SQLException ignored) { }
}
}
totalOk.addAndGet(ok);
close(c);
System.out.printf("Thread %2d: %d ok / %d fail%n", tid, ok, fail);
}

private static Connection openConnection() {
try {
Connection c = DriverManager.getConnection(JDBC_URL, USER,
PASSWORD);
c.setAutoCommit(true);
return c;
} catch (SQLException e) {
throw new RuntimeException("Failed to connect: " +
e.getMessage(), e);
}
}

private static void close(Connection c) {
try { if (c != null) c.close(); } catch (SQLException ignored) { }
}

private static String envOr(String key, String def) {
String v = System.getenv(key);
return (v != null && !v.trim().isEmpty()) ? v : def;
}
}

---- File 2: ProcCrashReprodure.java (Pure CALL test, 20 threads, infinite loop) ----
import java.sql.*;
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicBoolean;

/**
* PostgreSQL concurrent CALL stress test — run this together with
* ConcurrentSqlTest to reproduce backend crash (signal 5 / SIGTRAP).
*
* This program does ONLY repeated CALL proc_test() across 20 threads
* (infinite loop). ConcurrentSqlTest does DROP → CREATE → CALL in
* a loop. Run both simultaneously.
*
* Compile: javac -cp postgresql-42.7.5.jar ProcCrashReprodure.java
* Run: java -cp .:postgresql-42.7.5.jar ProcCrashReprodure
*
* Optional env vars: PG_HOST, PG_PORT, PG_DATABASE, PG_USER, PG_PASSWORD
*/
public final class ProcCrashReprodure {

private static final String HOST = envOr("PG_HOST",
"192.168.239.128");
private static final String PORT = envOr("PG_PORT", "5432");
private static final String DATABASE = envOr("PG_DATABASE", "postgres");
private static final String USER = envOr("PG_USER", "zlh_user");
private static final String PASSWORD = envOr("PG_PASSWORD",
"Gauss@123");
private static final String JDBC_URL =
"jdbc:postgresql://" + HOST + ":" + PORT + "/" + DATABASE;

private static final int THREADS = 20;

private static final AtomicBoolean stop = new AtomicBoolean(false);

static {
try { Class.forName("org.postgresql.Driver"); }
catch (ClassNotFoundException e) {
System.err.println("PostgreSQL JDBC driver not found on classpath");
System.exit(1);
}
}

public static void main(String[] args) {
System.out.println("=== PostgreSQL CALL stress test (run ConcurrentSqlTest too) ===");
System.out.println("URL: " + JDBC_URL);
System.out.println("Threads: " + THREADS + " | Loop: infinite (Ctrl+C to stop)");
System.out.println("Watch for: backend terminated by signal 5 (SIGTRAP)");
System.out.println("================================================================");

ExecutorService pool = Executors.newFixedThreadPool(THREADS);
CountDownLatch latch = new CountDownLatch(THREADS);

for (int i = 0; i < THREADS; i++) {
final int tid = i;
pool.submit(() -> {
try {
runWorker(tid);
} finally {
latch.countDown();
}
});
}

Runtime.getRuntime().addShutdownHook(new Thread(() -> {
System.out.println("\nShutting down...");
stop.set(true);
pool.shutdownNow();
}));

try { latch.await(); }
catch (InterruptedException e) { Thread.currentThread().interrupt();
}

pool.shutdownNow();
System.out.println("Done.");
}

private static void runWorker(int tid) {
while (!stop.get()) {
try (Connection c = DriverManager.getConnection(JDBC_URL, USER,
PASSWORD)) {
c.setAutoCommit(true);
while (!stop.get()) {
try (CallableStatement cs = c.prepareCall("{ CALL proc_test() }")) {
cs.execute();
} catch (SQLException e) {
// expected when proc is being dropped/recreated
if (c.isClosed()) break; // reconnect
}
}
} catch (SQLException e) {
// connection failure — pause then reconnect
try { Thread.sleep(100); }
catch (InterruptedException ie) { break; }
}
}
}

private static String envOr(String key, String def) {
String v = System.getenv(key);
return (v != null && !v.trim().isEmpty()) ? v : def;
}
}

#2Ayush Tiwari
ayushtiwari.slg01@gmail.com
In reply to: PG Bug reporting form (#1)

Hi,

On Sun, 14 Jun 2026 at 22:37, PG Bug reporting form <noreply@postgresql.org>
wrote:

> The following bug has been logged on the website:
>
> Bug reference: 19520
> [...]

Thanks for the report! I'm unsure if this has already been reported or
not.

I looked into this over the last day and could reproduce it locally.
Rather than the Java harness, I used ~60 concurrent psql clients looping
DROP / CREATE OR REPLACE / CALL of the same empty plpgsql procedure
(track_functions=all, pg_stat_statements loaded); here it PANICs within
a few seconds.

Just before the PANIC the failing backend logs:

ERROR: trying to drop stats entry already dropped: kind=function ...
WARNING: AbortTransaction while in COMMIT state
PANIC: cannot abort transaction xxx, it was already committed

So it looks like a function's shared stats entry gets dropped twice:
once out-of-band from pgstat_init_function_usage() when a concurrent
CALL notices the function is gone, and once from the transactional drop
at DROP time. When the latter loses the race it runs from
AtEOXact_PgStat(), past the commit record, so the "already dropped"
elog() in pgstat_drop_entry_internal() becomes the PANIC.

The two droppers and the guard all seem to date back to PG 15
(5891c7a8ed8f). I guess the "drop exactly once" assumption behind that
guard doesn't really hold for function stats, where two independent
droppers are legitimate.
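
To make the failure mode concrete, the double drop described above can be replayed with a small C model. This is only a sketch under simplifying assumptions: `RaceEntry`, `race_drop_strict()` and `race_replay()` are hypothetical stand-ins and not PostgreSQL code, and the two droppers run one after the other instead of in separate backends.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a shared function-stats entry. */
typedef struct RaceEntry
{
    bool dropped;               /* set by whichever dropper runs first */
} RaceEntry;

/*
 * Strict "drop exactly once" guard. Returns false where the real code
 * would raise the "already dropped" error.
 */
bool
race_drop_strict(RaceEntry *entry)
{
    if (entry->dropped)
        return false;
    entry->dropped = true;
    return true;
}

/*
 * Replays the losing interleaving sequentially. Returns true when the
 * first drop succeeds and the second one trips the guard.
 */
bool
race_replay(void)
{
    RaceEntry func_stats = {false};

    /* Dropper 1: a concurrent CALL notices the function is gone. */
    bool out_of_band_ok = race_drop_strict(&func_stats);

    /* Dropper 2: the transactional drop queued by DROP, run at commit. */
    bool transactional_ok = race_drop_strict(&func_stats);

    return out_of_band_ok && !transactional_ok;
}
```

In the server, the second failure happens past the commit record, which is why an ordinary error at that point escalates to a PANIC.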

I've added Andres and Michael on the thread, since they have worked on
this in the past, for their input.

Regards,
Ayush

#3Michael Paquier
michael@paquier.xyz
In reply to: Ayush Tiwari (#2)

On Mon, Jun 15, 2026 at 02:44:06PM +0530, Ayush Tiwari wrote:

> I've added Andres and Michael on the thread, since they have worked on
> this in the past, for their input.

Thanks for the poke. I have marked this thread as something to look
at, but was not able to get back to it. Will investigate..
--
Michael

#4Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#3)

On Tue, Jun 16, 2026 at 08:24:51AM +0900, Michael Paquier wrote:

> On Mon, Jun 15, 2026 at 02:44:06PM +0530, Ayush Tiwari wrote:
>
>> I've added Andres and Michael on the thread, since they have worked on
>> this in the past, for their input.
>
> Thanks for the poke. I have marked this thread as something to look
> at, but was not able to get back to it. Will investigate..

As far as I can see, pgss is not really a requirement. Your case is
taking advantage of the module introducing more slowness to enlarge
the reproduction window. Not saying that pgss being slow is a good
thing, it's bad, but it helps here. I've tried to reproduce in three
environments, only my mac is able to get something, because it's
slower I guess..

Attached is a script able to reproduce the issue in bash, courtesy of
Claude because java and I sum up to a value very close to 0, see
test_bug19520.txt. The trick of the script is the same as your
scenario, with two concurrent workloads:
- One with DROP PROC/CREATE PROC/CALL.
- One with CALL

I had much more success after adding two sleeps to enlarge the
conflict window, see also the sleep.patch attached, for reference.

Finally attached is a patch, where I'd like to propose the
introduction of a path in pgstat_drop_entry() to make the routine able
to accept double drops.

The big comment within pgstat_init_function_usage() documents why it
does its stuff for track_functions, so I was wondering if we should
enforce the same double-drop-acceptance rule for all the callers,
but I also see a point in the correctness, by allowing the
caller to complain if we try to do double drops but error on them,
pointing to a programming error. Note that
pgstat_drop_entry_internal() is not touched on purpose, to keep the
database-level scans as they are, with double-drops forbidden.
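
The idea of a drop routine that tolerates double drops on request can be sketched as follows. This is not the attached patch: `MokEntry`, `mok_drop_entry()` and `mok_double_drop_passes()` are hypothetical names, and the boolean result here only reports whether the call would have gone through without raising an error.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a shared stats entry. */
typedef struct MokEntry
{
    bool dropped;
} MokEntry;

/*
 * Drop an entry. With missing_ok = true an already-dropped entry is
 * accepted silently; with missing_ok = false it is reported (false is
 * returned here, where a real server would raise an error).
 */
bool
mok_drop_entry(MokEntry *entry, bool missing_ok)
{
    if (entry->dropped)
        return missing_ok;
    entry->dropped = true;
    return true;
}

/* Usage: drop the same entry twice and report whether both calls pass. */
bool
mok_double_drop_passes(bool missing_ok)
{
    MokEntry entry = {false};
    bool first = mok_drop_entry(&entry, missing_ok);
    bool second = mok_drop_entry(&entry, missing_ok);

    return first && second;
}
```

Callers with two legitimate droppers, such as the function stats case, would pass `missing_ok = true`, while callers that treat a double drop as a programming error would keep passing `false`.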

This patch is very close to what Sami has posted on his PGSS thread,
v3-0002, using a missing_ok instead of a skip_dropped:
/messages/by-id/CAA5RZ0uoxiQ2_=xHGRnyc4WdM9aR0fzdMhBubnw97po==--yGQ@mail.gmail.com
I didn't suspect that we would need something like that for a
backpatch, but well.

I'm adding Sami in CC in case he wishes to comment on this patch, and
Horiguchi-san as this area of the code concerns him.

Thoughts or comments welcome.
--
Michael

Attachments:

sleep.patch (text/plain; charset=us-ascii; +3 -0)
test_bug19520.txt (text/plain; charset=us-ascii)
0001-Fix-potential-PANICs-with-concurrent-drop-of-pgstats.patch (text/plain; charset=us-ascii; +26 -11)
#5Ayush Tiwari
ayushtiwari.slg01@gmail.com
In reply to: Michael Paquier (#4)

Hi,

On Wed, 17 Jun 2026 at 09:45, Michael Paquier <michael@paquier.xyz> wrote:

> On Tue, Jun 16, 2026 at 08:24:51AM +0900, Michael Paquier wrote:
>
>> On Mon, Jun 15, 2026 at 02:44:06PM +0530, Ayush Tiwari wrote:
>>
>>> I've added Andres and Michael on the thread, since they have worked on
>>> this in the past, for their input.
>>
>> Thanks for the poke. I have marked this thread as something to look
>> at, but was not able to get back to it. Will investigate..
>
> As far as I can see, pgss is not really a requirement. Your case is
> taking advantage of the module introducing more slowness to enlarge
> the reproduction window. Not saying that pgss being slow is a good
> thing, it's bad, but it helps here. I've tried to reproduce in three
> environments, only my mac is able to get something, because it's
> slower I guess..

Yeah, you are right, pgss is not a requirement, it just
makes the delay broader.

> Attached is a script able to reproduce the issue in bash, courtesy of
> Claude because java and I sum up to a value very close to 0, see
> test_bug19520.txt. The trick of the script is the same as your
> scenario, with two concurrent workloads:
> - One with DROP PROC/CREATE PROC/CALL.
> - One with CALL
>
> I had much more success after adding two sleeps to enlarge the
> conflict window, see also the sleep.patch attached, for reference.
>
> Finally attached is a patch, where I'd like to propose the
> introduction of a path in pgstat_drop_entry() to make the routine able
> to accept double drops.

I applied the patch on HEAD and ran my psql harness against it (~60
clients looping DROP / CREATE OR REPLACE / CALL, track_functions=all,
pgss loaded). Unpatched it PANICs within seconds; with the patch it
stayed up for a ~3 minute run, with the out-of-band drop path firing
several thousand times. So it clearly closes the hole here.

> The big comment within pgstat_init_function_usage() documents why it
> does its stuff for track_functions, so I was wondering if we should
> enforce the same double-drop-acceptance rule for all the callers,
> but I also see a point in the correctness, by allowing the
> caller to complain if we try to do double drops but error on them,
> pointing to a programming error. Note that
> pgstat_drop_entry_internal() is not touched on purpose, to keep the
> database-level scans as they are, with double-drops forbidden.
>
> This patch is very close to what Sami has posted on his PGSS thread,
> v3-0002, using a missing_ok instead of a skip_dropped:
> /messages/by-id/CAA5RZ0uoxiQ2_=xHGRnyc4WdM9aR0fzdMhBubnw97po==--yGQ@mail.gmail.com
> I didn't suspect that we would need something like that for a
> backpatch, but well.
>
> I'm adding Sami in CC in case he wishes to comment on this patch, and
> Horiguchi-san as this area of the code concerns him.
>
> Thoughts or comments welcome.

A couple of things which I'm not clear about (these are not blockers,
just questions for my understanding):

- With the check moved into the wrapper, pgstat_drop_entry_internal()
still keeps its own "already dropped" elog(). Every path into
_internal now seems to guarantee the entry isn't dropped, so
_internal's copy looks unreachable after the patch, and it's the one
with the richer refcount/generation detail. Was the idea to leave it
as a backstop, or would folding the handling into one place (or making
_internal's an Assert) be cleaner?

- In the missing_ok path the wrapper returns true, so the post-commit
caller skips the not_freed_count++/GC request that a "real" not-freed
drop would do. That seems harmless since the entry self-heals, but was
returning true there a deliberate choice over mirroring the
not-freed/false path? I need to take a look again at this, maybe I
missed something.

Regards,
Ayush

#6Sami Imseih
samimseih@gmail.com
In reply to: Ayush Tiwari (#5)

> This patch is very close to what Sami has posted on his PGSS thread,
> v3-0002, using a missing_ok instead of a skip_dropped:
> /messages/by-id/CAA5RZ0uoxiQ2_=xHGRnyc4WdM9aR0fzdMhBubnw97po==--yGQ@mail.gmail.com
> I didn't suspect that we would need something like that for a
> backpatch, but well.

Right, my intention was just adding infrastructure to make tolerating
a dropped entry possible and to be used by an extension in the future.
But, this bug report is timely and now it looks like we need this for
race conditions that are possible in core.

> A couple of things which I'm not clear about (these are not blockers
> just questions for my understanding):
>
> - With the check moved into the wrapper, pgstat_drop_entry_internal()
> still keeps its own "already dropped" elog(). Every path into
> _internal now seems to guarantee the entry isn't dropped, so
> _internal's copy looks unreachable after the patch, and it's the one
> with the richer refcount/generation detail.

Right. Michael's approach of moving the ERROR into the wrapper is better
than keeping it in _internal. After the patch, no caller enters
pgstat_drop_entry_internal() with a dropped entry:
pgstat_drop_database_and_contents() and pgstat_drop_matching_entries()
already filtered them out, and now the wrapper does too.

> Was the idea to leave it as a backstop, or would folding the handling into
> one place (or making _internal's an Assert) be cleaner?

The check in _internal should be converted to an Assert. This documents
that callers must only pass "live" entries, which will be the case
for all callers after the patch.

> - In the missing_ok path the wrapper returns true, so the post-commit
> caller skips the not_freed_count++/GC request that a "real" not-freed
> drop would do. That seems harmless since the entry self-heals
> but was returning true there a deliberate choice over mirroring
> the not-freed/false path? I need to take a look again at this, maybe
> I missed something.

Finding an already dropped entry tells me that the first caller to drop the
entry also triggered a gc request, so we should not request it again.
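
The two points above (an assertion in the internal routine, and no second GC request for an entry that is already dropped) can be sketched together in C. This is a simplified model under stated assumptions: `GcEntry`, `gc_drop_internal()`, `gc_drop_entry()` and `gc_requests_after_double_drop()` are hypothetical names, the standard `assert()` stands in for the server's Assert, and only the tolerant (missing_ok) path is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a shared stats entry. */
typedef struct GcEntry
{
    bool dropped;
    int  other_refs;            /* references still held by other backends */
} GcEntry;

/*
 * Internal routine: callers must pass a live entry, which the assertion
 * documents. Returns true if the entry could be freed right away, false
 * if other backends still reference it.
 */
bool
gc_drop_internal(GcEntry *entry)
{
    assert(!entry->dropped);
    entry->dropped = true;
    return entry->other_refs == 0;
}

/*
 * Wrapper: an already-dropped entry returns true, so the caller does not
 * count it as "not freed" a second time.
 */
bool
gc_drop_entry(GcEntry *entry)
{
    if (entry->dropped)
        return true;
    return gc_drop_internal(entry);
}

/* Counts how many GC requests two drops of the same entry would produce. */
int
gc_requests_after_double_drop(int other_refs)
{
    GcEntry entry = {false, other_refs};
    int not_freed_count = 0;

    if (!gc_drop_entry(&entry))
        not_freed_count++;      /* first dropper requests GC if needed */
    if (!gc_drop_entry(&entry))
        not_freed_count++;      /* not reached: the entry is already dropped */

    return not_freed_count;
}
```

With one other backend still holding a reference, the model counts a single GC request for the two drops; with no other references it counts none.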

--
Sami Imseih
Amazon Web Services (AWS)

#7Michael Paquier
michael@paquier.xyz
In reply to: Sami Imseih (#6)

On Wed, Jun 17, 2026 at 03:26:33PM -0500, Sami Imseih wrote:

>> Was the idea to leave it as a backstop, or would folding the handling into
>> one place (or making _internal's an Assert) be cleaner?
>
> The check in _internal should be converted to an Assert. This documents
> that callers must only pass "live" entries, which will be the case
> for all callers after the patch

Yeah, I was hesitating to do so, but perhaps you are right that there
is little meaning in keeping this extra elog() anymore in the internal
routine: all its callers discard entries marked as dropped. And we do
so while holding an exclusive lock.

>> - In the missing_ok path the wrapper returns true, so the post-commit
>> caller skips the not_freed_count++/GC request that a "real" not-freed
>> drop would do. That seems harmless since the entry self-heals
>> but was returning true there a deliberate choice over mirroring
>> the not-freed/false path? I need to take a look again at this, maybe
>> I missed something.
>
> Finding an already dropped entry tells me that the first caller to drop the
> entry also triggered a gc request, so we should not request it again.

Nope, we should not trigger multiple requests.

Attaching an updated patch for now. I am still testing it locally
across all the branches to make sure that the issue is gone (that
takes quite a bit of time). I'll probably apply it in a few hours
down to v15 if nothing pops up.
--
Michael

Attachments:

v2-0001-Fix-potential-PANICs-with-concurrent-drop-of-pgst.patch (text/plain; charset=us-ascii; +27 -19)
#8Ayush Tiwari
ayushtiwari.slg01@gmail.com
In reply to: Michael Paquier (#7)

Hi,

On Thu, 18 Jun 2026 at 06:33, Michael Paquier <michael@paquier.xyz> wrote:

> Attaching an updated patch for now. I am still testing it locally
> across all the branches to make sure that the issue is gone (that
> takes quite a bit of time). I'll probably apply it in a few hours
> down to v15 if nothing pops up.

The v2 patch looks fine to me. Tested it, no PANICs.

And the assert too looks good to me.

Regards,

#9Michael Paquier
michael@paquier.xyz
In reply to: Ayush Tiwari (#8)

On Thu, Jun 18, 2026 at 08:52:49AM +0530, Ayush Tiwari wrote:

> The v2 patch looks fine to me. Tested it, no PANICs.

The buildfarm has backfired with the ABI compliance check, which has
made me double-check for some extension code where the API change
would matter, based on this list:
https://wiki.postgresql.org/wiki/CustomCumulativeStats

And I did see one spot here, where there is a
pgstat_custom_drop_entry() that maps to a definition of
pgstat_drop_entry():
https://github.com/pganalyze/pg_stat_plans

I'll go file a ticket.
--
Michael

#10Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#9)

On Thu, Jun 18, 2026 at 02:40:46PM +0900, Michael Paquier wrote:

> On Thu, Jun 18, 2026 at 08:52:49AM +0530, Ayush Tiwari wrote:
>
>> The v2 patch looks fine to me. Tested it, no PANICs.
>
> The buildfarm has backfired with the ABI compliance check, which has
> made me double-check for some extension code where the API change
> would matter, based on this list:
> https://wiki.postgresql.org/wiki/CustomCumulativeStats

Another thing I have forgotten to mention regarding 850b9218c8e4..
I have tweaked the patch so that we still show the refcount and the
generation in the error message, something that was missed in the
latest version of the patch posted on this thread. This information
is useful for debugging purposes.
--
Michael

#11Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Michael Paquier (#9)

On 2026-Jun-18, Michael Paquier wrote:

> And I did see one spot here, where there is a
> pgstat_custom_drop_entry() that maps to a definition of
> pgstat_drop_entry():
> https://github.com/pganalyze/pg_stat_plans
>
> I'll go file a ticket.

Is this really the right reaction? As you know, for the extension
developers it is much more difficult to handle the ABI change on their
side, because they need to force all their users to update the extension
prior to updating Postgres. This is pretty difficult to do normally and
can lead to crashes in production. As an extension developer you can try
to communicate this, but for end users it is quite easy to miss it.

I think a better answer is to just not introduce the ABI change in
stable branches. That is, I think we should add a shim function so that
the third-party extensions can continue to use the original ABI; and
only in master you clean that up with a different API, whereby the
extension will be forced to have an #ifdef block for the 19 version or
the older versions, but that's fine because the extension has to be
recompiled for the new major version anyway so the end-user won't be
affected on a minor upgrade.
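
On the extension side, the "#ifdef block" mentioned above might look roughly like the C sketch below. `PG_VERSION_NUM` is the version macro that PostgreSQL provides through pg_config.h; everything else is an assumption made for illustration, including `CompatEntry`, `compat_server_drop()`, `EXT_DROP_ENTRY` and the position of the extra argument. The fallback `#define` only exists so that the sketch compiles on its own.

```c
#include <stdbool.h>

/* Stand-in so the sketch compiles alone; real builds get this from pg_config.h. */
#ifndef PG_VERSION_NUM
#define PG_VERSION_NUM 180000
#endif

/* Hypothetical stand-in for a shared stats entry. */
typedef struct CompatEntry
{
    bool dropped;
} CompatEntry;

#if PG_VERSION_NUM >= 190000
/* Hypothetical new-major API: the drop routine takes an extra argument. */
bool
compat_server_drop(CompatEntry *entry, bool missing_ok)
{
    if (entry->dropped)
        return missing_ok;
    entry->dropped = true;
    return true;
}
#define EXT_DROP_ENTRY(entry) compat_server_drop((entry), false)
#else
/* Hypothetical stable-branch API: original signature, unchanged. */
bool
compat_server_drop(CompatEntry *entry)
{
    if (entry->dropped)
        return false;
    entry->dropped = true;
    return true;
}
#define EXT_DROP_ENTRY(entry) compat_server_drop((entry))
#endif

/* Extension code calls the macro and never sees the signature difference. */
bool
compat_demo(void)
{
    CompatEntry entry = {false};
    bool first = EXT_DROP_ENTRY(&entry);
    bool second = EXT_DROP_ENTRY(&entry);

    return first && !second;
}
```

The version check is resolved when the extension is compiled, which already has to happen once per major version, so a minor upgrade of the server does not require a new extension build.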

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/

#12Michael Paquier
michael@paquier.xyz
In reply to: Alvaro Herrera (#11)

On Sat, Jun 20, 2026 at 01:09:59PM +0200, Alvaro Herrera wrote:

> I think a better answer is to just not introduce the ABI change in
> stable branches. That is, I think we should add a shim function so that
> the third-party extensions can continue to use the original ABI; and
> only in master you clean that up with a different API, whereby the
> extension will be forced to have an #ifdef block for the 19 version or
> the older versions, but that's fine because the extension has to be
> recompiled for the new major version anyway so the end-user won't be
> affected on a minor upgrade.

If you feel strongly about it, we could just do something like the
attached in the v15-v18 range. This introduces a new routine called
pgstat_drop_entry_ext() that gains the new argument "missing_ok", and
pgstat_drop_entry() would be an ABI-compatible wrapper calling it.
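
The shape of such a wrapper can be sketched in C as follows. This is an illustration and not the attached patch: `ShimEntry`, `shim_drop_entry_ext()` and `shim_drop_entry()` are hypothetical stand-ins with simplified signatures, and a false result marks the case where a real server would raise an error.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a shared stats entry. */
typedef struct ShimEntry
{
    bool dropped;
} ShimEntry;

/* New routine carrying the extra argument. */
bool
shim_drop_entry_ext(ShimEntry *entry, bool missing_ok)
{
    if (entry->dropped)
        return missing_ok;
    entry->dropped = true;
    return true;
}

/*
 * Original symbol and signature, kept so that extensions compiled against
 * an older minor release keep working without a rebuild.
 */
bool
shim_drop_entry(ShimEntry *entry)
{
    return shim_drop_entry_ext(entry, false);
}

/* Usage: the old entry point stays strict, the new one can be tolerant. */
bool
shim_demo(void)
{
    ShimEntry entry = {false};
    bool first = shim_drop_entry(&entry);               /* drops the entry */
    bool strict = shim_drop_entry(&entry);              /* double drop rejected */
    bool tolerant = shim_drop_entry_ext(&entry, true);  /* double drop accepted */

    return first && !strict && tolerant;
}
```

Because the old function keeps its name, argument list and behaviour, only callers that need the tolerant path have to switch to the new routine.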

What do you think?
--
Michael

Attachments:

0001-Re-introduce-pgstat_drop_entry-keeping-ABI-compatibi.patch (text/plain; charset=us-ascii; +29 -18)
#13Lukas Fittl
lukas@fittl.com
In reply to: Michael Paquier (#12)

On Sat, Jun 20, 2026 at 5:16 AM Michael Paquier <michael@paquier.xyz> wrote:

> On Sat, Jun 20, 2026 at 01:09:59PM +0200, Alvaro Herrera wrote:
>
>> I think a better answer is to just not introduce the ABI change in
>> stable branches. That is, I think we should add a shim function so that
>> the third-party extensions can continue to use the original ABI; and
>> only in master you clean that up with a different API, whereby the
>> extension will be forced to have an #ifdef block for the 19 version or
>> the older versions, but that's fine because the extension has to be
>> recompiled for the new major version anyway so the end-user won't be
>> affected on a minor upgrade.
>
> If you feel strongly about it, we could just do something like the
> attached in the v15-v18 range. This introduces a new routine called
> pgstat_drop_entry_ext() that gains the new argument "missing_ok", and
> pgstat_drop_entry() would be an ABI-compatible wrapper calling it.
>
> What do you think?

As the developer of the extension in the picture, I was actually
surprised to see the commit making an ABI *and* API breaking change
(and I think with cumulative stats being pluggable in 18, we can
expect people to use public stats functions like pgstat_drop_entry in
extensions), precisely because of the issues Alvaro mentioned.

I think doing it with a new routine is what I would have expected to
happen to preserve API and ABI compatibility, and your follow-up patch
looks good to me.

Thanks,
Lukas

--
Lukas Fittl

#14Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Michael Paquier (#12)
Re: BUG #19520: PANIC when concurrently manipulating stored procedures with pg_stat_statements and track_functions =

On 2026-Jun-20, Michael Paquier wrote:

If you feel strongly about it, we could just do something like the
attached in the v15-v18 range. This introduces a new routine called
pgstat_drop_entry_ext() that gains the new argument "missing_ok", and
pgstat_drop_entry() would be an ABI-compatible wrapper calling it.

What do you think?

Yeah, this sounds more or less reasonable. The callers that pass
missing_ok=false could still use the original function name though, no?

(Personally I would go for ABI compatibility in back branches with
this new function, and an API break in master by simply adding the
new argument everywhere while keeping the old function name. This way
we don't preserve unnecessary API ugliness forever.)

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"¿Qué importan los años? Lo que realmente importa es comprobar que
a fin de cuentas la mejor edad de la vida es estar vivo" (Mafalda)

#15Michael Paquier
michael@paquier.xyz
In reply to: Alvaro Herrera (#14)
Re: BUG #19520: PANIC when concurrently manipulating stored procedures with pg_stat_statements and track_functions =

On Sat, Jun 20, 2026 at 06:58:27PM +0200, Alvaro Herrera wrote:

Yeah, this sounds more or less reasonable. The callers that pass
missing_ok=false could still use the original function name though, no?

(Personally I would go for ABI compatibility in back branches with
this new function, and an API break in master by simply adding the
new argument everywhere while keeping the old function name. This way
we don't preserve unnecessary API ugliness forever.)

Yes, that would be the idea:
- On HEAD, keep the old function name, add the parameter.
- On the back-branches, use the new function name with the new
parameter. And, contrary to your suggestion, limit the use of the old
function name.

Using the old function name in the back-branches where missing_ok is
false would also work, of course. My suggestion just means one less
call shows up on the stack. The previously posted patch is not for
HEAD, only for v15~v18.
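
For extension authors, the HEAD/back-branch split discussed in this thread would presumably lead to a version guard along these lines. This is a rough sketch under stated assumptions: server_drop_entry() is a hypothetical stub standing in for the server routine, my_ext_drop_entry() is a made-up extension helper, the 190000 cutoff assumes the signature change lands in v19 only, and PG_VERSION_NUM is defined locally just so the fragment compiles outside a PostgreSQL tree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Normally provided by pg_config.h; assumed here for a standalone build. */
#ifndef PG_VERSION_NUM
#define PG_VERSION_NUM 180000
#endif

/* Stand-in for the PostgreSQL Oid type; illustrative only. */
typedef unsigned int Oid;

/* Records how many arguments the last stub call received. */
static int last_call_nargs = 0;

#if PG_VERSION_NUM >= 190000
/* Hypothetical stub: v19-style signature with the extra argument. */
static bool
server_drop_entry(int kind, Oid dboid, uint64_t objid, bool missing_ok)
{
    (void) kind; (void) dboid; (void) objid; (void) missing_ok;
    last_call_nargs = 4;
    return true;
}
#else
/* Hypothetical stub: signature kept unchanged in stable branches. */
static bool
server_drop_entry(int kind, Oid dboid, uint64_t objid)
{
    (void) kind; (void) dboid; (void) objid;
    last_call_nargs = 3;
    return true;
}
#endif

/* Extension-side helper that hides the version difference. */
static bool
my_ext_drop_entry(int kind, Oid dboid, uint64_t objid)
{
#if PG_VERSION_NUM >= 190000
    return server_drop_entry(kind, dboid, objid, false);
#else
    return server_drop_entry(kind, dboid, objid);
#endif
}
```

Since an extension has to be recompiled for each new major version anyway, the guard costs nothing at runtime and keeps a single source tree building against both signatures.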
--
Michael

#16Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#15)
Re: BUG #19520: PANIC when concurrently manipulating stored procedures with pg_stat_statements and track_functions =

On Sun, Jun 21, 2026 at 07:51:25AM +0900, Michael Paquier wrote:

Using the old function name in the back-branches where missing_ok is
false would also work, of course. My suggestion just means one less
call shows up on the stack. The previously posted patch is not for
HEAD, only for v15~v18.

By the way, regarding .abi-compliance-history, I am planning to remove
the latest entry after reading how ABICompCheck.pm works in the
buildfarm code. It uses the latest commit as a base point of
comparison, and compares it with the latest commit specified in the
ABI file.

Removing the entry in the same commit that adjusts the pgstats routine
to be ABI-compatible should work, and there is no point in adding an
extra entry to re-document the opposite ABI change. Any comments
perhaps?
--
Michael

#17Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#16)
Re: BUG #19520: PANIC when concurrently manipulating stored procedures with pg_stat_statements and track_functions =

On Mon, Jun 22, 2026 at 05:28:58PM +0900, Michael Paquier wrote:

By the way, regarding .abi-compliance-history, I am planning to remove
the latest entry after reading how ABICompCheck.pm works in the
buildfarm code. It uses the latest commit as a base point of
comparison, and compares it with the latest commit specified in the
ABI file.

And adjusted things on v15~v18 as of fe464e9e6863.
--
Michael