BUG #19500: pgrepack logical decoding plugin can crash assert builds via SQL decoding API

Started by PG Bug reporting form | 5 days ago | 5 messages | bugs
#1 PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 19500
Logged by: Nikita Kalinin
Email address: n.kalinin@postgrespro.ru
PostgreSQL version: 18.4
Operating system: Fedora 44
Description:

Hi,
It appears that the pgrepack output plugin is accessible through the SQL
logical decoding API, even though the plugin code explicitly indicates that
this interface is not supported. Reading changes from such a slot can cause
a backend process crash in builds with asserts enabled.
The crash is reproducible on the current master branch. Since the web form
does not allow selecting master, I selected the latest available released
version instead.

Steps to reproduce:
CREATE TABLE rp(a int);
SELECT *
FROM pg_create_logical_replication_slot('s_repack', 'pgrepack');
INSERT INTO rp VALUES (1);
SELECT *
FROM pg_logical_slot_get_binary_changes('s_repack', NULL, NULL);

Server log:

2026-05-28 21:32:23.185 +07 [142878] STATEMENT: SELECT *
FROM pg_create_logical_replication_slot('s_repack', 'pgrepack');
TRAP: failed Assert("RelationGetRelid(relation) == private->relid"), File: "pgrepack.c", Line: 100, PID: 142878
postgres: nkpit postgres [local] SELECT(ExceptionalCondition+0x57) [0xa2d437]
/tmp/pg/lib/postgresql/pgrepack.so(+0xa99) [0x7f7dd9332a99]

Backtrace:
#0 __pthread_kill_implementation (threadid=<optimized out>,
signo=signo@entry=6,
no_tid=no_tid@entry=0) at pthread_kill.c:44
#1 0x00007f7dd807a8d3 in __pthread_kill_internal (threadid=<optimized out>,
signo=6)
at pthread_kill.c:89
#2 0x00007f7dd801f48e in __GI_raise (sig=sig@entry=6) at
../sysdeps/posix/raise.c:26
#3 0x00007f7dd80067b3 in __GI_abort () at abort.c:77
#4 0x0000000000a2d458 in ExceptionalCondition (
conditionName=conditionName@entry=0x7f7dd93337c8
"RelationGetRelid(relation) == private->relid",
fileName=fileName@entry=0x7f7dd93337f5 "pgrepack.c",
lineNumber=lineNumber@entry=100) at assert.c:65
#5 0x00007f7dd9332a99 in repack_process_change (ctx=<optimized out>,
txn=<optimized out>, relation=<optimized out>, change=<optimized out>)
at pgrepack.c:100
#6 0x0000000000821223 in change_cb_wrapper (cache=<optimized out>,
txn=<optimized out>, relation=<optimized out>, change=<optimized out>)
at logical.c:1111
#7 0x000000000082d91b in ReorderBufferApplyChange (rb=<optimized out>,
txn=<optimized out>, relation=0x7f7dd848f7e8, change=0x29f71f10,
streaming=false)
at reorderbuffer.c:2080
#8 ReorderBufferProcessTXN (rb=0x29f55f20, txn=0x29f49e30,
commit_lsn=25673024,
snapshot_now=<optimized out>, command_id=command_id@entry=0,
streaming=streaming@entry=false) at reorderbuffer.c:2387
#9 0x000000000082dca9 in ReorderBufferReplay (txn=<optimized out>,
rb=<optimized out>, commit_lsn=<optimized out>, end_lsn=<optimized out>,
commit_time=<optimized out>, origin_id=<optimized out>, origin_lsn=0,
xid=<optimized out>) at reorderbuffer.c:2872
#10 0x000000000082ea38 in ReorderBufferCommit (rb=<optimized out>,
xid=<optimized out>, commit_lsn=<optimized out>, end_lsn=<optimized
out>,
commit_time=<optimized out>, origin_id=<optimized out>,
origin_lsn=<optimized out>)
at reorderbuffer.c:2896
#11 0x000000000081d075 in DecodeCommit (ctx=0x29f3de70, buf=0x7ffe263bd7e0,
parsed=0x7ffe263bd630, xid=695, two_phase=false) at decode.c:755
#12 xact_decode (ctx=0x29f3de70, buf=0x7ffe263bd7e0) at decode.c:254
#13 0x000000000081cbaa in LogicalDecodingProcessRecord
(ctx=ctx@entry=0x29f3de70,
record=<optimized out>) at decode.c:117
#14 0x0000000000823b71 in pg_logical_slot_get_changes_guts
(fcinfo=0x29f2d400,
confirm=confirm@entry=true, binary=binary@entry=true) at
logicalfuncs.c:267
#15 0x0000000000823d13 in pg_logical_slot_get_binary_changes
(fcinfo=<optimized out>)
at logicalfuncs.c:354
#16 0x00000000006b7cb5 in ExecMakeTableFunctionResult (setexpr=0x29f279c8,
econtext=0x29f27818, argContext=<optimized out>,
expectedDesc=0x29f2f348,
randomAccess=false) at execSRF.c:235
#17 0x00000000006ccad7 in FunctionNext (node=0x29f27608) at
nodeFunctionscan.c:95
#18 0x00000000006ac21a in ExecProcNode (node=0x29f27608)
at ../../../src/include/executor/executor.h:327
#19 ExecutePlan (queryDesc=0x29e40780, operation=CMD_SELECT,
sendTuples=true,
numberTuples=0, direction=<optimized out>, dest=0x29f29618) at
execMain.c:1736
#20 standard_ExecutorRun (queryDesc=0x29e40780, direction=<optimized out>,
count=0)
at execMain.c:377
#21 0x00000000008c5f98 in PortalRunSelect (portal=portal@entry=0x29eb7130,
forward=forward@entry=true, count=0, count@entry=9223372036854775807,
dest=dest@entry=0x29f29618) at pquery.c:917
#22 0x00000000008c767e in PortalRun (portal=portal@entry=0x29eb7130,
count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true,
dest=dest@entry=0x29f29618, altdest=altdest@entry=0x29f29618,
qc=qc@entry=0x7ffe263bdd50) at pquery.c:761
#23 0x00000000008c3308 in exec_simple_query (
query_string=0x29e13800 "SELECT *\n FROM
pg_logical_slot_get_binary_changes('s_repack', NULL, NULL);") at
postgres.c:1290
#24 0x00000000008c4de1 in PostgresMain (dbname=<optimized out>,
username=<optimized out>) at postgres.c:4856
#25 0x00000000008beddd in BackendMain (startup_data=<optimized out>,
startup_data_len=<optimized out>) at backend_startup.c:124
#26 0x00000000007fed6e in postmaster_child_launch (child_type=<optimized
out>,
child_slot=1, startup_data=startup_data@entry=0x7ffe263be1a0,
startup_data_len=startup_data_len@entry=24,
client_sock=client_sock@entry=0x7ffe263be1c0) at launch_backend.c:268
#27 0x0000000000802776 in BackendStartup (client_sock=0x7ffe263be1c0)
at postmaster.c:3627
#28 ServerLoop () at postmaster.c:1728
#29 0x0000000000804239 in PostmasterMain (argc=argc@entry=3,
argv=argv@entry=0x29dbcfe0) at postmaster.c:1415
#30 0x00000000004a1b48 in main (argc=3, argv=0x29dbcfe0) at main.c:231

postgres=# select version();
version
-------------------------------------------------------------------------------------------------------------
PostgreSQL 19devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 16.1.1
20260515 (Red Hat 16.1.1-2), 64-bit
(1 row)

Is this considered normal behavior for the pgrepack plugin, i.e. essentially
a “don’t do that” situation?

#2 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: PG Bug reporting form (#1)
Re: BUG #19500: pgrepack logical decoding plugin can crash assert builds via SQL decoding API

Hi,

On 2026-05-28, PG Bug reporting form wrote:

> It appears that the pgrepack output plugin is accessible through the SQL
> logical decoding API, even though the plugin code explicitly indicates that
> this interface is not supported. Reading changes from such a slot can cause
> a backend process crash in builds with asserts enabled.
>
> Is this considered normal behavior for the pgrepack plugin, i.e. essentially
> a “don’t do that” situation?

Yeah, I would like to have a way to prevent this, if only for user-friendliness, but it's not terribly pressing since only a role with REPLICATION privs can create the replication slot, which as I recall are already pretty powerful.

--
Álvaro Herrera

#3 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Alvaro Herrera (#2)
Re: BUG #19500: pgrepack logical decoding plugin can crash assert builds via SQL decoding API

On 2026-May-29, Álvaro Herrera wrote:

> On 2026-05-28, PG Bug reporting form wrote:
>
> > It appears that the pgrepack output plugin is accessible through the
> > SQL logical decoding API, even though the plugin code explicitly
> > indicates that this interface is not supported. Reading changes from
> > such a slot can cause a backend process crash in builds with asserts
> > enabled.
>
> Yeah, I would like to have a way to prevent this, if only for
> user-friendliness, but it's not terribly pressing since only a role
> with REPLICATION privs can create the replication slot, which as I
> recall are already pretty powerful.

How about something like this? It makes your test case throw an error
instead of failing the assertion, which I suppose is an improvement.

The patch is a bit noisy because I moved more code than the minimum
necessary; but the gist of it is that we allocate RepackDecodingState in
repack_startup(), then have repack_setup_logical_decoding() fill in a
magic number, which we later check in repack_begin_txn(). This is a bit
wasteful, because we have to do that check once for each and every
transaction; however I see no other callback that would let us do this
kind of check after the slot is created but before we start to consume
from it.
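
A minimal, standalone sketch of the magic-number idea described above, assuming nothing about the attached patch beyond what this message says. It does not use PostgreSQL headers: the `sketch_*` functions are hypothetical stand-ins for repack_startup(), repack_setup_logical_decoding() and repack_begin_txn(), the struct is a reduced stand-in for RepackDecodingState, and the value of `REPACK_STATE_MAGIC` is invented for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical value; the constant used by the real patch is not shown here. */
#define REPACK_STATE_MAGIC 0x52504B31u

/* Reduced stand-in for RepackDecodingState. */
typedef struct RepackDecodingState
{
    uint32_t     magic;  /* filled in only by the REPACK setup path */
    unsigned int relid;  /* table being repacked */
} RepackDecodingState;

/* Stand-in for repack_startup(): zeroed allocation, magic still unset. */
RepackDecodingState *
sketch_startup(void)
{
    return calloc(1, sizeof(RepackDecodingState));
}

/* Stand-in for repack_setup_logical_decoding(): stamps the state as valid. */
void
sketch_setup(RepackDecodingState *state, unsigned int relid)
{
    assert(state != NULL);
    state->relid = relid;
    state->magic = REPACK_STATE_MAGIC;
}

/*
 * Stand-in for repack_begin_txn(): returns 0 when the state was set up by
 * the expected caller, or -1 with a message in errbuf otherwise. A real
 * plugin would raise an ERROR instead of returning a code.
 */
int
sketch_begin_txn(const RepackDecodingState *state, char *errbuf, size_t errlen)
{
    if (state == NULL || state->magic != REPACK_STATE_MAGIC)
    {
        snprintf(errbuf, errlen, "wrong magic number 0x%08x in decoder state",
                 state ? (unsigned int) state->magic : 0u);
        return -1;
    }
    return 0;
}
```

In this sketch, a slot created through the SQL interface corresponds to calling sketch_startup() without sketch_setup(), so the first sketch_begin_txn() call fails cleanly instead of reaching the relid assertion.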

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"Before you were born your parents weren't as boring as they are now. They
got that way paying your bills, cleaning up your room and listening to you
tell them how idealistic you are." -- Charles J. Sykes' advice to teenagers

Attachments:

0001-Have-RepackDecodingState-carry-a-magic-number.patch (text/x-diff; charset=utf-8, +60 -43)

#4 Никита Калинин
n.kalinin@postgrespro.ru
In reply to: Alvaro Herrera (#3)
Re: BUG #19500: pgrepack logical decoding plugin can crash assert builds via SQL decoding API

On 2 Jun 2026, at 00:12, Álvaro Herrera <alvherre@kurilemu.de> wrote:

> How about something like this? It makes your test case throw an error
> instead of failing the assertion, which I suppose is an improvement.
>
> The patch is a bit noisy because I moved more code than the minimum
> necessary; but the gist of it is that we allocate RepackDecodingState in
> repack_startup(), then have repack_setup_logical_decoding() fill in a
> magic number, which we later check in repack_begin_txn(). This is a bit
> wasteful, because we have to do that check once for each and every
> transaction; however I see no other callback that would let us do this
> kind of check after the slot is created but before we start to consume
> from it.

Yes, I agree that returning an error to the user makes sense.

But does the error message need to be that detailed? Perhaps something like

"ERROR: wrong magic number in "pgrepack" decoder plugin"
would be sufficient.

Nevertheless, I tested the patch and can confirm that there are no assertion failures anymore.

I also ran it under ASAN and did not observe any issues.

Would it make sense to add a test for this case from the bug report?

#5 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Никита Калинин (#4)
Re: BUG #19500: pgrepack logical decoding plugin can crash assert builds via SQL decoding API

Hi Nikita,

On 2026-Jun-02, Никита Калинин wrote:

> On 2 Jun 2026, at 00:12, Álvaro Herrera <alvherre@kurilemu.de> wrote:
>
> But does the error message need to be that detailed? Perhaps something like
>
> "ERROR: wrong magic number in "pgrepack" decoder plugin"
> would be sufficient.

Maybe. Getting 0x00000000 would be quite different from 0x7f7f7f7f for
instance, or a completely random number, so I don't want to judge ahead
of time.
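
To illustrate why the observed value might be worth reporting, here is a small sketch that maps a found magic number to a likely cause. The interpretations are assumptions rather than anything stated in this thread: zero is what a zero-initialized allocation would contain, and 0x7f is the byte PostgreSQL's memory-debugging builds write over freed chunks. The function name `sketch_describe_magic` is hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper: guess what an unexpected magic value suggests.
 * The three cases mirror the examples given above.
 */
const char *
sketch_describe_magic(uint32_t found)
{
    if (found == 0x00000000u)
        return "zeroed: state was allocated but never set up";
    if (found == 0x7f7f7f7fu)
        return "0x7f fill pattern: possibly freed memory in a debug build";
    return "arbitrary value: possibly uninitialized or corrupted state";
}
```

A message that prints the raw value lets whoever reads the report make this distinction without a debugger.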

> Nevertheless, I tested the patch and can confirm that there are no
> assertion failures anymore.
>
> I also ran it under ASAN and did not observe any issues.

Thanks for testing it.

> Would it make sense to add a test for this case from the bug report?

Sure, I would do that.

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"Take out the book that your religion considers the right one for finding
the prayer that brings peace to your soul. Then reboot the computer
and see if it works" (Carlos Duclós)