START_REPLICATION SLOT causing a crash in an assert build

Started by Jaime Casanovaover 3 years ago33 messages
#1Jaime Casanova
jcasanov@systemguards.com.ec
1 attachment(s)

Hi,

I'm not sure what is causing this, but I have seen this twice. The
second time without activity after changing the set of tables in a
PUBLICATION.

gdb says that debug_query_string contains:

"""
START_REPLICATION SLOT "sub_pgbench" LOGICAL 0/0 (proto_version '3', publication_names '"pub_pgbench"')START_REPLICATION SLOT "sub_pgbench" LOGICAL 0/0 (proto_version '3', publication_names '"pub_pgbench"')
"""

attached the backtrace.

--
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL

Attachments:

gdb_start_replication_slot_crash.txttext/plain; charset=us-asciiDownload
#2Andres Freund
andres@anarazel.de
In reply to: Jaime Casanova (#1)
Re: START_REPLICATION SLOT causing a crash in an assert build

Hi,

On 2022-09-06 18:40:49 -0500, Jaime Casanova wrote:

I'm not sure what is causing this, but I have seen this twice. The
second time without activity after changing the set of tables in a
PUBLICATION.

Can you describe the steps to reproduce?

Which git commit does this happen on?

gdb says that debug_query_string contains:

"""
START_REPLICATION SLOT "sub_pgbench" LOGICAL 0/0 (proto_version '3', publication_names '"pub_pgbench"')START_REPLICATION SLOT "sub_pgbench" LOGICAL 0/0 (proto_version '3', publication_names '"pub_pgbench"')
"""

attached the backtrace.

#2 0x00005559bfd4f0ed in ExceptionalCondition (
conditionName=0x5559bff30e20 "namestrcmp(&statent->slotname, NameStr(slot->data.name)) == 0", errorType=0x5559bff30e0d "FailedAssertion", fileName=0x5559bff30dbb "pgstat_replslot.c",
lineNumber=89) at assert.c:69

what are statent->slotname and slot->data.name?

Greetings,

Andres Freund

#3Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Andres Freund (#2)
Re: START_REPLICATION SLOT causing a crash in an assert build

On Wed, Sep 07, 2022 at 12:39:08PM -0700, Andres Freund wrote:

Hi,

On 2022-09-06 18:40:49 -0500, Jaime Casanova wrote:

I'm not sure what is causing this, but I have seen this twice. The
second time without activity after changing the set of tables in a
PUBLICATION.

Can you describe the steps to reproduce?

I'm still trying to determine that

Which git commit does this happen on?

6e55ea79faa56db85a2b6c5bf94cee8acf8bfdb8 (Stamp 15beta4)

gdb says that debug_query_string contains:

"""
START_REPLICATION SLOT "sub_pgbench" LOGICAL 0/0 (proto_version '3', publication_names '"pub_pgbench"')START_REPLICATION SLOT "sub_pgbench" LOGICAL 0/0 (proto_version '3', publication_names '"pub_pgbench"')
"""

attached the backtrace.

#2 0x00005559bfd4f0ed in ExceptionalCondition (
conditionName=0x5559bff30e20 "namestrcmp(&statent->slotname, NameStr(slot->data.name)) == 0", errorType=0x5559bff30e0d "FailedAssertion", fileName=0x5559bff30dbb "pgstat_replslot.c",
lineNumber=89) at assert.c:69

what are statent->slotname and slot->data.name?

slot->data.name seems to be the replication_slot record, and
statent->slotname comes from the in shared memory stats for that slot.

And the assert happens when &statent->slotname.data comes empty, which
is not frequent but it happens from time to time

btw, while I'm looking at this I found that we can drop a publication
while there are active subscriptions pointing to it, is that something
we should allow?
anyway, that is not the cause of this because the replication slot actually
exists.

--
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL

#4Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Andres Freund (#2)
1 attachment(s)
Re: START_REPLICATION SLOT causing a crash in an assert build

On Wed, Sep 07, 2022 at 12:39:08PM -0700, Andres Freund wrote:

Hi,

On 2022-09-06 18:40:49 -0500, Jaime Casanova wrote:

I'm not sure what is causing this, but I have seen this twice. The
second time without activity after changing the set of tables in a
PUBLICATION.

This crash happens after a reset of statistics for a slot replication

Can you describe the steps to reproduce?

bin/pg_ctl -D data1 initdb
bin/pg_ctl -D data1 -l logfile1 -o "-c port=54315 -c wal_level=logical" start
bin/psql -p 54315 postgres <<EOF
create table t1 (i int primary key);
create publication pub1 for table t1;
EOF

bin/pg_ctl -D data2 initdb
bin/pg_ctl -D data2 -l logfile2 -o "-c port=54316" start
bin/psql -p 54316 postgres <<EOF
create table t1 (i int primary key);
create subscription sub1 connection 'host=/tmp port=54315 dbname=postgres' publication pub1;
EOF

bin/psql -p 54315 postgres <<EOF
select pg_stat_reset_replication_slot('sub1');
insert into t1 values(1);
EOF

Which git commit does this happen on?

just tested again on f5047c1293acce3c6c3802b06825aa3a9f9aa55a

gdb says that debug_query_string contains:

"""
START_REPLICATION SLOT "sub_pgbench" LOGICAL 0/0 (proto_version '3', publication_names '"pub_pgbench"')START_REPLICATION SLOT "sub_pgbench" LOGICAL 0/0 (proto_version '3', publication_names '"pub_pgbench"')
"""

attached the backtrace.

#2 0x00005559bfd4f0ed in ExceptionalCondition (
conditionName=0x5559bff30e20 "namestrcmp(&statent->slotname, NameStr(slot->data.name)) == 0", errorType=0x5559bff30e0d "FailedAssertion", fileName=0x5559bff30dbb "pgstat_replslot.c",
lineNumber=89) at assert.c:69

what are statent->slotname and slot->data.name?

and the problem seems to be that after zero'ing the stats that includes
the name of the replication slot, this simple patch fixes it... not sure
if it's the right fix though...

--
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL

Attachments:

fix_reset_stat_replslot.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/backend/utils/activity/pgstat_replslot.c b/src/backend/utils/activity/pgstat_replslot.c
index b77c05ab5f..db44d51b4c 100644
--- a/src/backend/utils/activity/pgstat_replslot.c
+++ b/src/backend/utils/activity/pgstat_replslot.c
@@ -65,6 +65,7 @@ pgstat_reset_replslot(const char *name)
        /* reset this one entry */
        pgstat_reset(PGSTAT_KIND_REPLSLOT, InvalidOid,
                                 ReplicationSlotIndex(slot));
+       namestrcpy(&slot->data.name, name);
 }
 
 /*

#5Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Jaime Casanova (#4)
1 attachment(s)
Re: START_REPLICATION SLOT causing a crash in an assert build

Nice finding.

At Tue, 13 Sep 2022 00:39:45 -0500, Jaime Casanova <jcasanov@systemguards.com.ec> wrote in

and the problem seems to be that after zero'ing the stats that includes
the name of the replication slot, this simple patch fixes it... not sure
if it's the right fix though...

That doesn't work. since what that function clears is not the name in
the slot struct but that in stats entry.

The cause is what pg_stat_reset_replslot wants to do does not match
what pgstat feature thinks as reset.

Unfortunately, we don't have a function to leave a portion alone after
a reset. However, fetching the stats entry in pgstat_reset_replslot is
ugly..

I'm not sure this is less uglier but it works if
pgstat_report_replslot sets the name if it is found to be wiped
out... But that hafly nullify the protction by the assertion just
after.

--- a/src/backend/utils/activity/pgstat_replslot.c
+++ b/src/backend/utils/activity/pgstat_replslot.c
@@ -83,9 +83,11 @@ pgstat_report_replslot(ReplicationSlot *slot, const PgStat_StatReplSlotEntry *re
 	statent = &shstatent->stats;
 	/*
-	 * Any mismatch should have been fixed in pgstat_create_replslot() or
-	 * pgstat_acquire_replslot().
+	 * pgstat_create_replslot() and pgstat_acquire_replslot() enters the name,
+	 * but pgstat_reset_replslot() may clear it.
 	 */
+	if (statent->slotname.data[0] == 0)
+		namestrcpy(&shstatent->stats.slotname, NameStr(slot->data.name));
 	Assert(namestrcmp(&statent->slotname, NameStr(slot->data.name)) == 0);

Another measure would be to add the region to wipe-out on reset to
PgStat_KindInfo, but it seems too much.. (attached)

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

replslot_reset_fix.txttext/plain; charset=us-asciiDownload
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index 6224c498c2..ab88ebbc64 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -263,6 +263,8 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = {
 		.shared_size = sizeof(PgStatShared_Database),
 		.shared_data_off = offsetof(PgStatShared_Database, stats),
 		.shared_data_len = sizeof(((PgStatShared_Database *) 0)->stats),
+		.reset_off = offsetof(PgStatShared_Database, stats),
+		.reset_len = sizeof(((PgStatShared_Database *) 0)->stats),
 		.pending_size = sizeof(PgStat_StatDBEntry),
 
 		.flush_pending_cb = pgstat_database_flush_cb,
@@ -277,6 +279,8 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = {
 		.shared_size = sizeof(PgStatShared_Relation),
 		.shared_data_off = offsetof(PgStatShared_Relation, stats),
 		.shared_data_len = sizeof(((PgStatShared_Relation *) 0)->stats),
+		.reset_off = offsetof(PgStatShared_Relation, stats),
+		.reset_len = sizeof(((PgStatShared_Relation *) 0)->stats),
 		.pending_size = sizeof(PgStat_TableStatus),
 
 		.flush_pending_cb = pgstat_relation_flush_cb,
@@ -291,6 +295,8 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = {
 		.shared_size = sizeof(PgStatShared_Function),
 		.shared_data_off = offsetof(PgStatShared_Function, stats),
 		.shared_data_len = sizeof(((PgStatShared_Function *) 0)->stats),
+		.reset_off = offsetof(PgStatShared_Function, stats),
+		.reset_len = sizeof(((PgStatShared_Function *) 0)->stats),
 		.pending_size = sizeof(PgStat_BackendFunctionEntry),
 
 		.flush_pending_cb = pgstat_function_flush_cb,
@@ -307,6 +313,9 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = {
 		.shared_size = sizeof(PgStatShared_ReplSlot),
 		.shared_data_off = offsetof(PgStatShared_ReplSlot, stats),
 		.shared_data_len = sizeof(((PgStatShared_ReplSlot *) 0)->stats),
+		.reset_off = offsetof(PgStatShared_ReplSlot, stats.spill_txns),
+		.reset_len = sizeof(PgStatShared_ReplSlot) -
+		offsetof(PgStatShared_ReplSlot, stats.spill_txns),
 
 		.reset_timestamp_cb = pgstat_replslot_reset_timestamp_cb,
 		.to_serialized_name = pgstat_replslot_to_serialized_name_cb,
@@ -323,6 +332,8 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = {
 		.shared_size = sizeof(PgStatShared_Subscription),
 		.shared_data_off = offsetof(PgStatShared_Subscription, stats),
 		.shared_data_len = sizeof(((PgStatShared_Subscription *) 0)->stats),
+		.reset_off = offsetof(PgStatShared_Subscription, stats),
+		.reset_len = sizeof(((PgStatShared_Subscription *) 0)->stats),
 		.pending_size = sizeof(PgStat_BackendSubEntry),
 
 		.flush_pending_cb = pgstat_subscription_flush_cb,
diff --git a/src/backend/utils/activity/pgstat_shmem.c b/src/backend/utils/activity/pgstat_shmem.c
index ac98918688..09a8c3873c 100644
--- a/src/backend/utils/activity/pgstat_shmem.c
+++ b/src/backend/utils/activity/pgstat_shmem.c
@@ -915,8 +915,9 @@ shared_stat_reset_contents(PgStat_Kind kind, PgStatShared_Common *header,
 {
 	const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
 
-	memset(pgstat_get_entry_data(kind, header), 0,
-		   pgstat_get_entry_len(kind));
+	memset((char *)pgstat_get_entry_data(kind, header) +
+		   kind_info->reset_off, 0,
+		   kind_info->reset_len);
 
 	if (kind_info->reset_timestamp_cb)
 		kind_info->reset_timestamp_cb(header, ts);
diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h
index 901d2041d6..3715a48776 100644
--- a/src/include/utils/pgstat_internal.h
+++ b/src/include/utils/pgstat_internal.h
@@ -215,6 +215,12 @@ typedef struct PgStat_KindInfo
 	uint32		shared_data_off;
 	uint32		shared_data_len;
 
+	/*
+	 * The offset/size of the region to wipe out when the entry is reset.
+	 */
+	uint32		reset_off;
+	uint32		reset_len;
+
 	/*
 	 * The size of the pending data for this kind. E.g. how large
 	 * PgStat_EntryRef->pending is. Used for allocations.
#6Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Kyotaro Horiguchi (#5)
Re: START_REPLICATION SLOT causing a crash in an assert build

On Tue, Sep 13, 2022 at 06:48:45PM +0900, Kyotaro Horiguchi wrote:

Nice finding.

At Tue, 13 Sep 2022 00:39:45 -0500, Jaime Casanova <jcasanov@systemguards.com.ec> wrote in

and the problem seems to be that after zero'ing the stats that includes
the name of the replication slot, this simple patch fixes it... not sure
if it's the right fix though...

That doesn't work. since what that function clears is not the name in
the slot struct but that in stats entry.

you're right... the curious thing is that I tested it and it worked, but
now it doesn't... maybe it was too late for me...

The cause is what pg_stat_reset_replslot wants to do does not match
what pgstat feature thinks as reset.

[...]

Another measure would be to add the region to wipe-out on reset to
PgStat_KindInfo, but it seems too much.. (attached)

This patch solves the problem, i didn't like the other solution because
as you say it partly nullify the protection of the assertion.

--
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL

#7Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Jaime Casanova (#6)
Re: START_REPLICATION SLOT causing a crash in an assert build

On Tue, Sep 13, 2022 at 10:07:50PM -0500, Jaime Casanova wrote:

On Tue, Sep 13, 2022 at 06:48:45PM +0900, Kyotaro Horiguchi wrote:

Another measure would be to add the region to wipe-out on reset to
PgStat_KindInfo, but it seems too much.. (attached)

This patch solves the problem, i didn't like the other solution because
as you say it partly nullify the protection of the assertion.

I talked too fast, while it solves the immediate problem the patch as is
causes other crashes.

--
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL

#8Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Jaime Casanova (#7)
Re: START_REPLICATION SLOT causing a crash in an assert build

At Thu, 15 Sep 2022 01:26:15 -0500, Jaime Casanova <jcasanov@systemguards.com.ec> wrote in

On Tue, Sep 13, 2022 at 10:07:50PM -0500, Jaime Casanova wrote:

On Tue, Sep 13, 2022 at 06:48:45PM +0900, Kyotaro Horiguchi wrote:

Another measure would be to add the region to wipe-out on reset to
PgStat_KindInfo, but it seems too much.. (attached)

This patch solves the problem, i didn't like the other solution because
as you say it partly nullify the protection of the assertion.

I talked too fast, while it solves the immediate problem the patch as is
causes other crashes.

Where did the crash happen? Is it a bug introduced by it? Or does it
root to other cause?

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#9Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Kyotaro Horiguchi (#8)
1 attachment(s)
Re: START_REPLICATION SLOT causing a crash in an assert build

On Thu, Sep 15, 2022 at 05:30:11PM +0900, Kyotaro Horiguchi wrote:

At Thu, 15 Sep 2022 01:26:15 -0500, Jaime Casanova <jcasanov@systemguards.com.ec> wrote in

On Tue, Sep 13, 2022 at 10:07:50PM -0500, Jaime Casanova wrote:

On Tue, Sep 13, 2022 at 06:48:45PM +0900, Kyotaro Horiguchi wrote:

Another measure would be to add the region to wipe-out on reset to
PgStat_KindInfo, but it seems too much.. (attached)

This patch solves the problem, i didn't like the other solution because
as you say it partly nullify the protection of the assertion.

I talked too fast, while it solves the immediate problem the patch as is
causes other crashes.

Where did the crash happen? Is it a bug introduced by it? Or does it
root to other cause?

Just compile and run the installcheck tests.

It fails at ./src/backend/utils/activity/pgstat_shmem.c:530 inside
pgstat_release_entry_ref() because it expects a "deadbeef", it seems to
be a magic variable but cannot find what its use is.

Attached a backtrace.

--
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL

Attachments:

gdb.txttext/plain; charset=us-asciiDownload
#10Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Jaime Casanova (#9)
1 attachment(s)
Re: START_REPLICATION SLOT causing a crash in an assert build

At Thu, 15 Sep 2022 11:15:12 -0500, Jaime Casanova <jcasanov@systemguards.com.ec> wrote in

It fails at ./src/backend/utils/activity/pgstat_shmem.c:530 inside

Thanks for the info. I reproduced by make check.. stupid..

It's the thinko about the base address of reset_off.

So the attached doesn't crash..

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

replslot_reset_fix_2.txttext/plain; charset=us-asciiDownload
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index 6224c498c2..ed3f3af4d9 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -263,6 +263,8 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = {
 		.shared_size = sizeof(PgStatShared_Database),
 		.shared_data_off = offsetof(PgStatShared_Database, stats),
 		.shared_data_len = sizeof(((PgStatShared_Database *) 0)->stats),
+		.reset_off = 0,
+		.reset_len = sizeof(((PgStatShared_Database *) 0)->stats),
 		.pending_size = sizeof(PgStat_StatDBEntry),
 
 		.flush_pending_cb = pgstat_database_flush_cb,
@@ -277,6 +279,8 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = {
 		.shared_size = sizeof(PgStatShared_Relation),
 		.shared_data_off = offsetof(PgStatShared_Relation, stats),
 		.shared_data_len = sizeof(((PgStatShared_Relation *) 0)->stats),
+		.reset_off = 0,
+		.reset_len = sizeof(((PgStatShared_Relation *) 0)->stats),
 		.pending_size = sizeof(PgStat_TableStatus),
 
 		.flush_pending_cb = pgstat_relation_flush_cb,
@@ -291,6 +295,8 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = {
 		.shared_size = sizeof(PgStatShared_Function),
 		.shared_data_off = offsetof(PgStatShared_Function, stats),
 		.shared_data_len = sizeof(((PgStatShared_Function *) 0)->stats),
+		.reset_off = 0,
+		.reset_len = sizeof(((PgStatShared_Function *) 0)->stats),
 		.pending_size = sizeof(PgStat_BackendFunctionEntry),
 
 		.flush_pending_cb = pgstat_function_flush_cb,
@@ -307,6 +313,10 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = {
 		.shared_size = sizeof(PgStatShared_ReplSlot),
 		.shared_data_off = offsetof(PgStatShared_ReplSlot, stats),
 		.shared_data_len = sizeof(((PgStatShared_ReplSlot *) 0)->stats),
+		/* reset doesn't wipe off slot name */
+		.reset_off = offsetof(PgStat_StatReplSlotEntry, spill_txns),
+		.reset_len = sizeof(((PgStatShared_ReplSlot *) 0)->stats),
+		offsetof(PgStat_StatReplSlotEntry, spill_txns),
 
 		.reset_timestamp_cb = pgstat_replslot_reset_timestamp_cb,
 		.to_serialized_name = pgstat_replslot_to_serialized_name_cb,
@@ -323,6 +333,8 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = {
 		.shared_size = sizeof(PgStatShared_Subscription),
 		.shared_data_off = offsetof(PgStatShared_Subscription, stats),
 		.shared_data_len = sizeof(((PgStatShared_Subscription *) 0)->stats),
+		.reset_off = 0,
+		.reset_len = sizeof(((PgStatShared_Subscription *) 0)->stats),
 		.pending_size = sizeof(PgStat_BackendSubEntry),
 
 		.flush_pending_cb = pgstat_subscription_flush_cb,
diff --git a/src/backend/utils/activity/pgstat_shmem.c b/src/backend/utils/activity/pgstat_shmem.c
index ac98918688..09a8c3873c 100644
--- a/src/backend/utils/activity/pgstat_shmem.c
+++ b/src/backend/utils/activity/pgstat_shmem.c
@@ -915,8 +915,9 @@ shared_stat_reset_contents(PgStat_Kind kind, PgStatShared_Common *header,
 {
 	const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
 
-	memset(pgstat_get_entry_data(kind, header), 0,
-		   pgstat_get_entry_len(kind));
+	memset((char *)pgstat_get_entry_data(kind, header) +
+		   kind_info->reset_off, 0,
+		   kind_info->reset_len);
 
 	if (kind_info->reset_timestamp_cb)
 		kind_info->reset_timestamp_cb(header, ts);
diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h
index 901d2041d6..144fe5814c 100644
--- a/src/include/utils/pgstat_internal.h
+++ b/src/include/utils/pgstat_internal.h
@@ -215,6 +215,13 @@ typedef struct PgStat_KindInfo
 	uint32		shared_data_off;
 	uint32		shared_data_len;
 
+	/*
+	 * The offset/size of the region to wipe out when the entry is reset.
+	 * reset_off is relative to shared_data_off.
+	 */
+	uint32		reset_off;
+	uint32		reset_len;
+
 	/*
 	 * The size of the pending data for this kind. E.g. how large
 	 * PgStat_EntryRef->pending is. Used for allocations.
#11Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Kyotaro Horiguchi (#10)
Re: START_REPLICATION SLOT causing a crash in an assert build

On Fri, Sep 16, 2022 at 02:37:17PM +0900, Kyotaro Horiguchi wrote:

At Thu, 15 Sep 2022 11:15:12 -0500, Jaime Casanova <jcasanov@systemguards.com.ec> wrote in

It fails at ./src/backend/utils/activity/pgstat_shmem.c:530 inside

Thanks for the info. I reproduced by make check.. stupid..

It's the thinko about the base address of reset_off.

So the attached doesn't crash..

Hi,

Just confirming there have been no crash since this last patch.

--
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL

#12Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Jaime Casanova (#11)
Re: START_REPLICATION SLOT causing a crash in an assert build

At Mon, 19 Sep 2022 11:04:03 -0500, Jaime Casanova <jcasanov@systemguards.com.ec> wrote in

On Fri, Sep 16, 2022 at 02:37:17PM +0900, Kyotaro Horiguchi wrote:

At Thu, 15 Sep 2022 11:15:12 -0500, Jaime Casanova <jcasanov@systemguards.com.ec> wrote in

It fails at ./src/backend/utils/activity/pgstat_shmem.c:530 inside

Thanks for the info. I reproduced by make check.. stupid..

It's the thinko about the base address of reset_off.

So the attached doesn't crash..

Hi,

Just confirming there have been no crash since this last patch.

Thanks for confirmation.

Althouh I'm not sure whether this is the right direction, this seems
to be an open item of 15?

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#13Andres Freund
andres@anarazel.de
In reply to: Kyotaro Horiguchi (#10)
Re: START_REPLICATION SLOT causing a crash in an assert build

Hi,

I wonder if the correct fix here wouldn't be to move the slotname out of
PgStat_StatReplSlotEntry?

On 2022-09-16 14:37:17 +0900, Kyotaro Horiguchi wrote:

diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index 6224c498c2..ed3f3af4d9 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -263,6 +263,8 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = {
.shared_size = sizeof(PgStatShared_Database),
.shared_data_off = offsetof(PgStatShared_Database, stats),
.shared_data_len = sizeof(((PgStatShared_Database *) 0)->stats),
+		.reset_off = 0,
+		.reset_len = sizeof(((PgStatShared_Database *) 0)->stats),
.pending_size = sizeof(PgStat_StatDBEntry),
.flush_pending_cb = pgstat_database_flush_cb,
@@ -277,6 +279,8 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = {
.shared_size = sizeof(PgStatShared_Relation),
.shared_data_off = offsetof(PgStatShared_Relation, stats),
.shared_data_len = sizeof(((PgStatShared_Relation *) 0)->stats),
+		.reset_off = 0,
+		.reset_len = sizeof(((PgStatShared_Relation *) 0)->stats),
.pending_size = sizeof(PgStat_TableStatus),
.flush_pending_cb = pgstat_relation_flush_cb,
@@ -291,6 +295,8 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = {
.shared_size = sizeof(PgStatShared_Function),
.shared_data_off = offsetof(PgStatShared_Function, stats),
.shared_data_len = sizeof(((PgStatShared_Function *) 0)->stats),
+		.reset_off = 0,
+		.reset_len = sizeof(((PgStatShared_Function *) 0)->stats),
.pending_size = sizeof(PgStat_BackendFunctionEntry),
.flush_pending_cb = pgstat_function_flush_cb,
@@ -307,6 +313,10 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = {
.shared_size = sizeof(PgStatShared_ReplSlot),
.shared_data_off = offsetof(PgStatShared_ReplSlot, stats),
.shared_data_len = sizeof(((PgStatShared_ReplSlot *) 0)->stats),
+		/* reset doesn't wipe off slot name */
+		.reset_off = offsetof(PgStat_StatReplSlotEntry, spill_txns),
+		.reset_len = sizeof(((PgStatShared_ReplSlot *) 0)->stats),
+		offsetof(PgStat_StatReplSlotEntry, spill_txns),

I'm confused what this offsetof does here? It's not even assigned to a
specific field? Am I missing something?

Also, wouldn't we need to subtract something of the size?

diff --git a/src/backend/utils/activity/pgstat_shmem.c b/src/backend/utils/activity/pgstat_shmem.c
index ac98918688..09a8c3873c 100644
--- a/src/backend/utils/activity/pgstat_shmem.c
+++ b/src/backend/utils/activity/pgstat_shmem.c
@@ -915,8 +915,9 @@ shared_stat_reset_contents(PgStat_Kind kind, PgStatShared_Common *header,
{
const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
-	memset(pgstat_get_entry_data(kind, header), 0,
-		   pgstat_get_entry_len(kind));
+	memset((char *)pgstat_get_entry_data(kind, header) +
+		   kind_info->reset_off, 0,
+		   kind_info->reset_len);

if (kind_info->reset_timestamp_cb)
kind_info->reset_timestamp_cb(header, ts);

This likely doesn't quite conform to what pgindent wants...

Greetings,

Andres Freund

#14Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Andres Freund (#13)
1 attachment(s)
Re: START_REPLICATION SLOT causing a crash in an assert build

Thanks!

At Mon, 26 Sep 2022 19:53:02 -0700, Andres Freund <andres@anarazel.de> wrote in

I wonder if the correct fix here wouldn't be to move the slotname out of
PgStat_StatReplSlotEntry?

Ugh. Right. I thought its outer struct as purely the part for the
common header. But we can freely place anything after the header
part. I moved it to the outer struct. I didn't clear that part in
pgstat_create_relation() because it is filled in immediately.

The attached is that.

On 2022-09-16 14:37:17 +0900, Kyotaro Horiguchi wrote:

...

@@ -307,6 +313,10 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = {
.shared_size = sizeof(PgStatShared_ReplSlot),
.shared_data_off = offsetof(PgStatShared_ReplSlot, stats),
.shared_data_len = sizeof(((PgStatShared_ReplSlot *) 0)->stats),
+		/* reset doesn't wipe off slot name */
+		.reset_off = offsetof(PgStat_StatReplSlotEntry, spill_txns),
+		.reset_len = sizeof(((PgStatShared_ReplSlot *) 0)->stats),
+		offsetof(PgStat_StatReplSlotEntry, spill_txns),

I'm confused what this offsetof does here? It's not even assigned to a
specific field? Am I missing something?

Also, wouldn't we need to subtract something of the size?

Yeah, I felt it confusing. The last line above is offset from just
after the header part (it is PgStat_, not PgStatShared_). I first
wrote that as you suggested but rewrote to shorter representation.

diff --git a/src/backend/utils/activity/pgstat_shmem.c b/src/backend/utils/activity/pgstat_shmem.c
index ac98918688..09a8c3873c 100644
--- a/src/backend/utils/activity/pgstat_shmem.c
+++ b/src/backend/utils/activity/pgstat_shmem.c
@@ -915,8 +915,9 @@ shared_stat_reset_contents(PgStat_Kind kind, PgStatShared_Common *header,
{
const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
-	memset(pgstat_get_entry_data(kind, header), 0,
-		   pgstat_get_entry_len(kind));
+	memset((char *)pgstat_get_entry_data(kind, header) +
+		   kind_info->reset_off, 0,
+		   kind_info->reset_len);

if (kind_info->reset_timestamp_cb)
kind_info->reset_timestamp_cb(header, ts);

This likely doesn't quite conform to what pgindent wants...

In the first place, it's ugly...

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

v1-0001-Do-not-reset-slot-name-of-replication-slot-stats.patchtext/x-patch; charset=us-asciiDownload
From 5789b35c1eca77ad63066e6671921e1e8a656372 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Date: Tue, 27 Sep 2022 14:28:56 +0900
Subject: [PATCH v1] Do not reset slot name of replication slot stats

pg_stat_reset_replication_slot() clears counters involving slot name,
which is not intended and causes crash.  Move slot name out of
PgStat_StatReplSlotEntry to PgStatShared_Database so that the name
won't be wiped out by a counter reset.
---
 src/backend/utils/activity/pgstat_replslot.c | 12 ++++++------
 src/include/pgstat.h                         |  1 -
 src/include/utils/pgstat_internal.h          |  1 +
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/backend/utils/activity/pgstat_replslot.c b/src/backend/utils/activity/pgstat_replslot.c
index 2039ac3147..0a383f9f32 100644
--- a/src/backend/utils/activity/pgstat_replslot.c
+++ b/src/backend/utils/activity/pgstat_replslot.c
@@ -86,7 +86,7 @@ pgstat_report_replslot(ReplicationSlot *slot, const PgStat_StatReplSlotEntry *re
 	 * Any mismatch should have been fixed in pgstat_create_replslot() or
 	 * pgstat_acquire_replslot().
 	 */
-	Assert(namestrcmp(&statent->slotname, NameStr(slot->data.name)) == 0);
+	Assert(namestrcmp(&shstatent->slotname, NameStr(slot->data.name)) == 0);
 
 	/* Update the replication slot statistics */
 #define REPLSLOT_ACC(fld) statent->fld += repSlotStat->fld
@@ -124,7 +124,7 @@ pgstat_create_replslot(ReplicationSlot *slot)
 	 * if we previously crashed after dropping a slot.
 	 */
 	memset(&shstatent->stats, 0, sizeof(shstatent->stats));
-	namestrcpy(&shstatent->stats.slotname, NameStr(slot->data.name));
+	namestrcpy(&shstatent->slotname, NameStr(slot->data.name));
 
 	pgstat_unlock_entry(entry_ref);
 }
@@ -148,11 +148,11 @@ pgstat_acquire_replslot(ReplicationSlot *slot)
 	 * NB: need to accept that there might be stats from an older slot, e.g.
 	 * if we previously crashed after dropping a slot.
 	 */
-	if (NameStr(statent->slotname)[0] == 0 ||
-		namestrcmp(&statent->slotname, NameStr(slot->data.name)) != 0)
+	if (NameStr(shstatent->slotname)[0] == 0 ||
+		namestrcmp(&shstatent->slotname, NameStr(slot->data.name)) != 0)
 	{
 		memset(statent, 0, sizeof(*statent));
-		namestrcpy(&statent->slotname, NameStr(slot->data.name));
+		namestrcpy(&shstatent->slotname, NameStr(slot->data.name));
 	}
 
 	pgstat_unlock_entry(entry_ref);
@@ -187,7 +187,7 @@ pgstat_fetch_replslot(NameData slotname)
 void
 pgstat_replslot_to_serialized_name_cb(const PgStatShared_Common *header, NameData *name)
 {
-	namestrcpy(name, NameStr(((PgStatShared_ReplSlot *) header)->stats.slotname));
+	namestrcpy(name, NameStr(((PgStatShared_ReplSlot *) header)->slotname));
 }
 
 bool
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index ad7334a0d2..530d49e5c3 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -321,7 +321,6 @@ typedef struct PgStat_StatFuncEntry
 
 typedef struct PgStat_StatReplSlotEntry
 {
-	NameData	slotname;
 	PgStat_Counter spill_txns;
 	PgStat_Counter spill_count;
 	PgStat_Counter spill_bytes;
diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h
index 40a3602855..9bdcc08ed3 100644
--- a/src/include/utils/pgstat_internal.h
+++ b/src/include/utils/pgstat_internal.h
@@ -380,6 +380,7 @@ typedef struct PgStatShared_Subscription
 typedef struct PgStatShared_ReplSlot
 {
 	PgStatShared_Common header;
+	NameData	slotname;
 	PgStat_StatReplSlotEntry stats;
 } PgStatShared_ReplSlot;
 
-- 
2.31.1

#15Jonathan S. Katz
jkatz@postgresql.org
In reply to: Kyotaro Horiguchi (#14)
Re: START_REPLICATION SLOT causing a crash in an assert build

On 9/27/22 1:52 AM, Kyotaro Horiguchi wrote:

Thanks!

At Mon, 26 Sep 2022 19:53:02 -0700, Andres Freund <andres@anarazel.de> wrote in

I wonder if the correct fix here wouldn't be to move the slotname out of
PgStat_StatReplSlotEntry?

Ugh. Right. I thought its outer struct as purely the part for the
common header. But we can freely place anything after the header
part. I moved it to the outer struct. I didn't clear that part in
pgstat_create_relation() because it is filled in immediately.

The attached is that.

This is still listed as an open item[1]https://wiki.postgresql.org/wiki/PostgreSQL_15_Open_Items for v15. Does this fix proposed
address the issue?

Thanks,

Jonathan

[1]: https://wiki.postgresql.org/wiki/PostgreSQL_15_Open_Items

#16Andres Freund
andres@anarazel.de
In reply to: Jonathan S. Katz (#15)
Re: START_REPLICATION SLOT causing a crash in an assert build

Hi,

On 2022-10-05 13:00:53 -0400, Jonathan S. Katz wrote:

On 9/27/22 1:52 AM, Kyotaro Horiguchi wrote:

Thanks!

At Mon, 26 Sep 2022 19:53:02 -0700, Andres Freund <andres@anarazel.de> wrote in

I wonder if the correct fix here wouldn't be to move the slotname out of
PgStat_StatReplSlotEntry?

Ugh. Right. I thought its outer struct as purely the part for the
common header. But we can freely place anything after the header
part. I moved it to the outer struct. I didn't clear that part in
pgstat_create_relation() because it is filled in immediately.

The attached is that.

This is still listed as an open item[1] for v15. Does this fix proposed
address the issue?

Unfortunately not - it doesn't even pass the existing tests
(test_decoding/001_repl_stats fails) :(.

The reason for that is that with the patch nothing restores the slotname when
reading stats from disk. That turns out not to cause immediate issues, but at
the next shutdown the name won't be set, and we'll serialize the stats data
with an empty string as the name.

I have two ideas how to fix it. As a design constraint, I'd be interested in
the RMTs opinion on the following:
Is a cleaner fix that changes the stats format (i.e. existing stats will be
thrown away when upgrading) or one that doesn't change the stats format
preferrable?

Greetings,

Andres Freund

#17Jonathan S. Katz
jkatz@postgresql.org
In reply to: Andres Freund (#16)
Re: START_REPLICATION SLOT causing a crash in an assert build

On 10/5/22 8:44 PM, Andres Freund wrote:

Hi,

On 2022-10-05 13:00:53 -0400, Jonathan S. Katz wrote:

On 9/27/22 1:52 AM, Kyotaro Horiguchi wrote:

Thanks!

At Mon, 26 Sep 2022 19:53:02 -0700, Andres Freund <andres@anarazel.de> wrote in

I wonder if the correct fix here wouldn't be to move the slotname out of
PgStat_StatReplSlotEntry?

Ugh. Right. I thought its outer struct as purely the part for the
common header. But we can freely place anything after the header
part. I moved it to the outer struct. I didn't clear that part in
pgstat_create_relation() because it is filled in immediately.

The attached is that.

This is still listed as an open item[1] for v15. Does this fix proposed
address the issue?

Unfortunately not - it doesn't even pass the existing tests
(test_decoding/001_repl_stats fails) :(.

Thanks for checking.

The reason for that is that with the patch nothing restores the slotname when
reading stats from disk. That turns out not to cause immediate issues, but at
the next shutdown the name won't be set, and we'll serialize the stats data
with an empty string as the name.

Ah.

I have two ideas how to fix it. As a design constraint, I'd be interested in
the RMTs opinion on the following:
Is a cleaner fix that changes the stats format (i.e. existing stats will be
thrown away when upgrading) or one that doesn't change the stats format
preferrable?

[My opinion, will let Michael/John chime in]

Ideally we would keep the stats on upgrade -- I think that's the better
user experience.

Thanks,

Jonathan

#18Michael Paquier
michael@paquier.xyz
In reply to: Jonathan S. Katz (#17)
Re: START_REPLICATION SLOT causing a crash in an assert build

On Wed, Oct 05, 2022 at 11:24:57PM -0400, Jonathan S. Katz wrote:

On 10/5/22 8:44 PM, Andres Freund wrote:

I have two ideas how to fix it. As a design constraint, I'd be interested in
the RMTs opinion on the following:
Is a cleaner fix that changes the stats format (i.e. existing stats will be
thrown away when upgrading) or one that doesn't change the stats format
preferrable?

[My opinion, will let Michael/John chime in]

Ideally we would keep the stats on upgrade -- I think that's the better user
experience.

The release has not happened yet, so I would be fine to remain
flexible and bump again PGSTAT_FILE_FORMAT_ID so as we have the
cleanest approach in place for the release and the future. At the
end, we would throw automatically the data of a file that's marked
with a version that does not match with what we expect at load time,
so there's a limited impact on the user experience, except, well,
losing these stats, of course.
--
Michael

#19Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Michael Paquier (#18)
1 attachment(s)
Re: START_REPLICATION SLOT causing a crash in an assert build

At Thu, 6 Oct 2022 13:44:43 +0900, Michael Paquier <michael@paquier.xyz> wrote in

On Wed, Oct 05, 2022 at 11:24:57PM -0400, Jonathan S. Katz wrote:

On 10/5/22 8:44 PM, Andres Freund wrote:

I have two ideas how to fix it. As a design constraint, I'd be interested in
the RMTs opinion on the following:
Is a cleaner fix that changes the stats format (i.e. existing stats will be
thrown away when upgrading) or one that doesn't change the stats format
preferrable?

[My opinion, will let Michael/John chime in]

Ideally we would keep the stats on upgrade -- I think that's the better user
experience.

The release has not happened yet, so I would be fine to remain
flexible and bump again PGSTAT_FILE_FORMAT_ID so as we have the
cleanest approach in place for the release and the future. At the
end, we would throw automatically the data of a file that's marked
with a version that does not match with what we expect at load time,
so there's a limited impact on the user experience, except, well,
losing these stats, of course.

+1. FWIW, the atttached is an example of what it looks like if we
avoid file format change.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

v2-0001-If-we-avoid-file-format.patchtext/x-patch; charset=us-asciiDownload
From 4effe9279d6535191d7f5478bb76490dc8762d33 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Date: Thu, 6 Oct 2022 14:06:04 +0900
Subject: [PATCH v2] If we avoid file format

---
 src/backend/utils/activity/pgstat.c          |  9 +++++++--
 src/backend/utils/activity/pgstat_replslot.c | 18 ++++++++++++------
 src/include/pgstat.h                         |  1 -
 src/include/utils/pgstat_internal.h          |  3 +++
 4 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index 609f0b1ad8..a03184c9ef 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -310,6 +310,7 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = {
 
 		.reset_timestamp_cb = pgstat_replslot_reset_timestamp_cb,
 		.to_serialized_name = pgstat_replslot_to_serialized_name_cb,
+		.set_name = pgstat_replslot_set_name_cb,
 		.from_serialized_name = pgstat_replslot_from_serialized_name_cb,
 	},
 
@@ -1523,6 +1524,8 @@ pgstat_read_statsfile(void)
 					PgStat_HashKey key;
 					PgStatShared_HashEntry *p;
 					PgStatShared_Common *header;
+					const PgStat_KindInfo *kind_info = NULL;
+					NameData	name = {0};
 
 					CHECK_FOR_INTERRUPTS();
 
@@ -1538,9 +1541,7 @@ pgstat_read_statsfile(void)
 					else
 					{
 						/* stats entry identified by name on disk (e.g. slots) */
-						const PgStat_KindInfo *kind_info = NULL;
 						PgStat_Kind kind;
-						NameData	name;
 
 						if (!read_chunk_s(fpin, &kind))
 							goto error;
@@ -1590,6 +1591,10 @@ pgstat_read_statsfile(void)
 									pgstat_get_entry_len(key.kind)))
 						goto error;
 
+					/* set name if it requires to do that separately */
+					if (kind_info && kind_info->set_name && NameStr(name)[0])
+						kind_info->set_name(header, &name);
+						
 					break;
 				}
 			case 'E':
diff --git a/src/backend/utils/activity/pgstat_replslot.c b/src/backend/utils/activity/pgstat_replslot.c
index 2039ac3147..9fa4df610d 100644
--- a/src/backend/utils/activity/pgstat_replslot.c
+++ b/src/backend/utils/activity/pgstat_replslot.c
@@ -86,7 +86,7 @@ pgstat_report_replslot(ReplicationSlot *slot, const PgStat_StatReplSlotEntry *re
 	 * Any mismatch should have been fixed in pgstat_create_replslot() or
 	 * pgstat_acquire_replslot().
 	 */
-	Assert(namestrcmp(&statent->slotname, NameStr(slot->data.name)) == 0);
+	Assert(namestrcmp(&shstatent->slotname, NameStr(slot->data.name)) == 0);
 
 	/* Update the replication slot statistics */
 #define REPLSLOT_ACC(fld) statent->fld += repSlotStat->fld
@@ -124,7 +124,7 @@ pgstat_create_replslot(ReplicationSlot *slot)
 	 * if we previously crashed after dropping a slot.
 	 */
 	memset(&shstatent->stats, 0, sizeof(shstatent->stats));
-	namestrcpy(&shstatent->stats.slotname, NameStr(slot->data.name));
+	namestrcpy(&shstatent->slotname, NameStr(slot->data.name));
 
 	pgstat_unlock_entry(entry_ref);
 }
@@ -148,11 +148,11 @@ pgstat_acquire_replslot(ReplicationSlot *slot)
 	 * NB: need to accept that there might be stats from an older slot, e.g.
 	 * if we previously crashed after dropping a slot.
 	 */
-	if (NameStr(statent->slotname)[0] == 0 ||
-		namestrcmp(&statent->slotname, NameStr(slot->data.name)) != 0)
+	if (NameStr(shstatent->slotname)[0] == 0 ||
+		namestrcmp(&shstatent->slotname, NameStr(slot->data.name)) != 0)
 	{
 		memset(statent, 0, sizeof(*statent));
-		namestrcpy(&statent->slotname, NameStr(slot->data.name));
+		namestrcpy(&shstatent->slotname, NameStr(slot->data.name));
 	}
 
 	pgstat_unlock_entry(entry_ref);
@@ -187,7 +187,13 @@ pgstat_fetch_replslot(NameData slotname)
 void
 pgstat_replslot_to_serialized_name_cb(const PgStatShared_Common *header, NameData *name)
 {
-	namestrcpy(name, NameStr(((PgStatShared_ReplSlot *) header)->stats.slotname));
+	namestrcpy(name, NameStr(((PgStatShared_ReplSlot *) header)->slotname));
+}
+
+void
+pgstat_replslot_set_name_cb(const PgStatShared_Common *header, NameData *name)
+{
+	namestrcpy(&((PgStatShared_ReplSlot *) header)->slotname, NameStr(*name));
 }
 
 bool
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index ad7334a0d2..530d49e5c3 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -321,7 +321,6 @@ typedef struct PgStat_StatFuncEntry
 
 typedef struct PgStat_StatReplSlotEntry
 {
-	NameData	slotname;
 	PgStat_Counter spill_txns;
 	PgStat_Counter spill_count;
 	PgStat_Counter spill_bytes;
diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h
index 40a3602855..ff07247062 100644
--- a/src/include/utils/pgstat_internal.h
+++ b/src/include/utils/pgstat_internal.h
@@ -244,6 +244,7 @@ typedef struct PgStat_KindInfo
 	 */
 	void		(*to_serialized_name) (const PgStatShared_Common *header, NameData *name);
 	bool		(*from_serialized_name) (const NameData *name, PgStat_HashKey *key);
+	void		(*set_name) (const PgStatShared_Common *header, NameData *name);
 
 	/*
 	 * For fixed-numbered statistics: Reset All.
@@ -380,6 +381,7 @@ typedef struct PgStatShared_Subscription
 typedef struct PgStatShared_ReplSlot
 {
 	PgStatShared_Common header;
+	NameData	slotname;
 	PgStat_StatReplSlotEntry stats;
 } PgStatShared_ReplSlot;
 
@@ -568,6 +570,7 @@ extern void pgstat_relation_delete_pending_cb(PgStat_EntryRef *entry_ref);
 
 extern void pgstat_replslot_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
 extern void pgstat_replslot_to_serialized_name_cb(const PgStatShared_Common *header, NameData *name);
+extern void pgstat_replslot_set_name_cb(const PgStatShared_Common *header, NameData *name);
 extern bool pgstat_replslot_from_serialized_name_cb(const NameData *name, PgStat_HashKey *key);
 
 
-- 
2.31.1

#20Jonathan S. Katz
jkatz@postgresql.org
In reply to: Kyotaro Horiguchi (#19)
Re: START_REPLICATION SLOT causing a crash in an assert build

On 10/6/22 1:10 AM, Kyotaro Horiguchi wrote:

At Thu, 6 Oct 2022 13:44:43 +0900, Michael Paquier <michael@paquier.xyz> wrote in

On Wed, Oct 05, 2022 at 11:24:57PM -0400, Jonathan S. Katz wrote:

On 10/5/22 8:44 PM, Andres Freund wrote:

I have two ideas how to fix it. As a design constraint, I'd be interested in
the RMTs opinion on the following:
Is a cleaner fix that changes the stats format (i.e. existing stats will be
thrown away when upgrading) or one that doesn't change the stats format
preferrable?

[My opinion, will let Michael/John chime in]

Ideally we would keep the stats on upgrade -- I think that's the better user
experience.

The release has not happened yet, so I would be fine to remain
flexible and bump again PGSTAT_FILE_FORMAT_ID so as we have the
cleanest approach in place for the release and the future.

Yes, I agree with this.

At the

end, we would throw automatically the data of a file that's marked
with a version that does not match with what we expect at load time,
so there's a limited impact on the user experience, except, well,
losing these stats, of course.

I'm fine with this.

+1. FWIW, the atttached is an example of what it looks like if we
avoid file format change.

Thanks for the quick turnaround. I'll let others chime in on the review.

Thanks,

Jonathan

#21Andres Freund
andres@anarazel.de
In reply to: Kyotaro Horiguchi (#19)
1 attachment(s)
Re: START_REPLICATION SLOT causing a crash in an assert build

Hi,

On 2022-10-06 14:10:46 +0900, Kyotaro Horiguchi wrote:

+1. FWIW, the atttached is an example of what it looks like if we
avoid file format change.

What about if we go the other direction - simply remove the name from the
stats entry at all. I don't actually think we need it anymore. Unless I am
missing something right now - entirely possible! - the danger that
pgstat_acquire_replslot() mentions doesn't actually exist [anymore]. After a
crash we throw away the old stats data and if a slot is dropped while shut
down, we'll not load the slot data at startup.

The attached is a rough prototype, but should be enough for Jaime to test and
Horiguchi to test / review / think about.

Amit, I CCed you, since you've thought a bunch about the dangers in this area
too.

Greetings,

Andres Freund

Attachments:

v3-0001-pgstat-Remove-slotname-from-PgStat_StatReplSlotEn.patchtext/x-diff; charset=us-asciiDownload
From f34c6ed510fbe4c079a0c80e9040cac77c60fd19 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Thu, 6 Oct 2022 15:50:36 -0700
Subject: [PATCH v3] pgstat: Remove slotname from PgStat_StatReplSlotEntry

This fixes a bug where the slotname is reset as part of
pgstat_reset_replslot() subsequently causing an allocation failure in
pgstat_report_replslot(). It also is architecturally nicer, because having the
name as part of the stats isn't quite right, on account of not actually being
stats :)

As far as I can tell we actually have worked around all the dangers that lead
us to keeping the stats name part of the stats.

Author:
Reviewed-by:
Discussion: https://postgr.es/m/
Backpatch:
---
 src/include/pgstat.h                         |  2 ++
 src/include/utils/pgstat_internal.h          |  5 +--
 src/backend/utils/activity/pgstat.c          |  2 +-
 src/backend/utils/activity/pgstat_replslot.c | 36 +++++---------------
 4 files changed, 14 insertions(+), 31 deletions(-)

diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index ad7334a0d21..2a85fa0369a 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -321,7 +321,9 @@ typedef struct PgStat_StatFuncEntry
 
 typedef struct PgStat_StatReplSlotEntry
 {
+#if IN_PG_15_ONLY_UNUSED
 	NameData	slotname;
+#endif
 	PgStat_Counter spill_txns;
 	PgStat_Counter spill_count;
 	PgStat_Counter spill_bytes;
diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h
index 40a36028554..627c1389e4c 100644
--- a/src/include/utils/pgstat_internal.h
+++ b/src/include/utils/pgstat_internal.h
@@ -242,7 +242,8 @@ typedef struct PgStat_KindInfo
 	/*
 	 * For variable-numbered stats with named_on_disk. Optional.
 	 */
-	void		(*to_serialized_name) (const PgStatShared_Common *header, NameData *name);
+	void		(*to_serialized_name) (const PgStat_HashKey *key,
+									   const PgStatShared_Common *header, NameData *name);
 	bool		(*from_serialized_name) (const NameData *name, PgStat_HashKey *key);
 
 	/*
@@ -567,7 +568,7 @@ extern void pgstat_relation_delete_pending_cb(PgStat_EntryRef *entry_ref);
  */
 
 extern void pgstat_replslot_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
-extern void pgstat_replslot_to_serialized_name_cb(const PgStatShared_Common *header, NameData *name);
+extern void pgstat_replslot_to_serialized_name_cb(const PgStat_HashKey *key, const PgStatShared_Common *header, NameData *name);
 extern bool pgstat_replslot_from_serialized_name_cb(const NameData *name, PgStat_HashKey *key);
 
 
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index 609f0b1ad86..1b97597f17c 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -1367,7 +1367,7 @@ pgstat_write_statsfile(void)
 			/* stats entry identified by name on disk (e.g. slots) */
 			NameData	name;
 
-			kind_info->to_serialized_name(shstats, &name);
+			kind_info->to_serialized_name(&ps->key, shstats, &name);
 
 			fputc('N', fpout);
 			write_chunk_s(fpout, &ps->key.kind);
diff --git a/src/backend/utils/activity/pgstat_replslot.c b/src/backend/utils/activity/pgstat_replslot.c
index 2039ac3147f..793f26fc950 100644
--- a/src/backend/utils/activity/pgstat_replslot.c
+++ b/src/backend/utils/activity/pgstat_replslot.c
@@ -82,12 +82,6 @@ pgstat_report_replslot(ReplicationSlot *slot, const PgStat_StatReplSlotEntry *re
 	shstatent = (PgStatShared_ReplSlot *) entry_ref->shared_stats;
 	statent = &shstatent->stats;
 
-	/*
-	 * Any mismatch should have been fixed in pgstat_create_replslot() or
-	 * pgstat_acquire_replslot().
-	 */
-	Assert(namestrcmp(&statent->slotname, NameStr(slot->data.name)) == 0);
-
 	/* Update the replication slot statistics */
 #define REPLSLOT_ACC(fld) statent->fld += repSlotStat->fld
 	REPLSLOT_ACC(spill_txns);
@@ -124,7 +118,6 @@ pgstat_create_replslot(ReplicationSlot *slot)
 	 * if we previously crashed after dropping a slot.
 	 */
 	memset(&shstatent->stats, 0, sizeof(shstatent->stats));
-	namestrcpy(&shstatent->stats.slotname, NameStr(slot->data.name));
 
 	pgstat_unlock_entry(entry_ref);
 }
@@ -135,27 +128,13 @@ pgstat_create_replslot(ReplicationSlot *slot)
 void
 pgstat_acquire_replslot(ReplicationSlot *slot)
 {
-	PgStat_EntryRef *entry_ref;
-	PgStatShared_ReplSlot *shstatent;
-	PgStat_StatReplSlotEntry *statent;
-
-	entry_ref = pgstat_get_entry_ref_locked(PGSTAT_KIND_REPLSLOT, InvalidOid,
-											ReplicationSlotIndex(slot), false);
-	shstatent = (PgStatShared_ReplSlot *) entry_ref->shared_stats;
-	statent = &shstatent->stats;
-
 	/*
-	 * NB: need to accept that there might be stats from an older slot, e.g.
-	 * if we previously crashed after dropping a slot.
+	 * XXX: I think there cannot actually be data from an older slot
+	 * here. After a crash we throw away the old stats data and if a slot is
+	 * dropped while shut down, we'll not load the slot data at startup.
 	 */
-	if (NameStr(statent->slotname)[0] == 0 ||
-		namestrcmp(&statent->slotname, NameStr(slot->data.name)) != 0)
-	{
-		memset(statent, 0, sizeof(*statent));
-		namestrcpy(&statent->slotname, NameStr(slot->data.name));
-	}
-
-	pgstat_unlock_entry(entry_ref);
+	pgstat_get_entry_ref(PGSTAT_KIND_REPLSLOT, InvalidOid,
+						 ReplicationSlotIndex(slot), false, NULL);
 }
 
 /*
@@ -185,9 +164,10 @@ pgstat_fetch_replslot(NameData slotname)
 }
 
 void
-pgstat_replslot_to_serialized_name_cb(const PgStatShared_Common *header, NameData *name)
+pgstat_replslot_to_serialized_name_cb(const PgStat_HashKey *key, const PgStatShared_Common *header, NameData *name)
 {
-	namestrcpy(name, NameStr(((PgStatShared_ReplSlot *) header)->stats.slotname));
+	/* FIXME: wrap in a slot.c function */
+	namestrcpy(name, NameStr(ReplicationSlotCtl->replication_slots[key->objoid].data.name));
 }
 
 bool
-- 
2.38.0

#22Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Andres Freund (#21)
Re: START_REPLICATION SLOT causing a crash in an assert build

On Fri, Oct 7, 2022 at 8:00 AM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2022-10-06 14:10:46 +0900, Kyotaro Horiguchi wrote:

+1. FWIW, the atttached is an example of what it looks like if we
avoid file format change.

What about if we go the other direction - simply remove the name from the
stats entry at all. I don't actually think we need it anymore. Unless I am
missing something right now - entirely possible! - the danger that
pgstat_acquire_replslot() mentions doesn't actually exist [anymore]. After a
crash we throw away the old stats data and if a slot is dropped while shut
down, we'll not load the slot data at startup.

+1. I think it works. Since the replication slot index doesn't change
during server running we can fetch the name from
ReplicationSlotCtl->replication_slots.

If we don't need the name in stats entry, pgstat_acquire_replslot() is
no longer necessary?

Regards,

--
Masahiko Sawada
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#23Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Masahiko Sawada (#22)
Re: START_REPLICATION SLOT causing a crash in an assert build

At Fri, 7 Oct 2022 12:14:40 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in

What about if we go the other direction - simply remove the name from the
stats entry at all. I don't actually think we need it anymore. Unless I am
missing something right now - entirely possible! - the danger that
pgstat_acquire_replslot() mentions doesn't actually exist [anymore]. After a
crash we throw away the old stats data and if a slot is dropped while shut
down, we'll not load the slot data at startup.

The key point of this is this:

+	 * XXX: I think there cannot actually be data from an older slot
+	 * here. After a crash we throw away the old stats data and if a slot is
+	 * dropped while shut down, we'll not load the slot data at startup.

I think this is true. Assuming that we don't recreate or rename
objects that have stats after writing out stats, we won't have stats
for a different object with the same name. If we can rely on that
fact, the existing check in pgstat_acquire_replslot() becomes
useless. Thus we don't need to store object name in stats entry. I
agree to that.

+1. I think it works. Since the replication slot index doesn't change
during server running we can fetch the name from
ReplicationSlotCtl->replication_slots.

That access seems safe in a bit different aspect, too. Both
checkpointer (and walsender) properly initialize ReplicationSlotCtl.

If we don't need the name in stats entry, pgstat_acquire_replslot() is
no longer necessary?

I think so. The entry will be created at the first report.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#24Andres Freund
andres@anarazel.de
In reply to: Kyotaro Horiguchi (#23)
Re: START_REPLICATION SLOT causing a crash in an assert build

Hi,

On 2022-10-07 15:30:43 +0900, Kyotaro Horiguchi wrote:

At Fri, 7 Oct 2022 12:14:40 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in

What about if we go the other direction - simply remove the name from the
stats entry at all. I don't actually think we need it anymore. Unless I am
missing something right now - entirely possible! - the danger that
pgstat_acquire_replslot() mentions doesn't actually exist [anymore]. After a
crash we throw away the old stats data and if a slot is dropped while shut
down, we'll not load the slot data at startup.

The key point of this is this:

+	 * XXX: I think there cannot actually be data from an older slot
+	 * here. After a crash we throw away the old stats data and if a slot is
+	 * dropped while shut down, we'll not load the slot data at startup.

I think this is true. Assuming that we don't recreate or rename
objects that have stats after writing out stats, we won't have stats
for a different object with the same name.

Thanks for thinking through this!

If we can rely on that fact, the existing check in pgstat_acquire_replslot()
becomes useless. Thus we don't need to store object name in stats entry. I
agree to that.

I don't agree with this aspect. I think it's better to ensure that the stats
object exists when acquiring a slot, rather than later, when reporting. It's a
lot simpler to think through the lock nesting etc that way.

I'd also eventually like to remove the stats that are currently kept
separately in ReorderBuffer, and that will be easier/cheaper if we can rely on
the stats objects to have already been created.

Greetings,

Andres Freund

#25Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#24)
1 attachment(s)
Re: START_REPLICATION SLOT causing a crash in an assert build

Hi,

On 2022-10-07 12:00:56 -0700, Andres Freund wrote:

On 2022-10-07 15:30:43 +0900, Kyotaro Horiguchi wrote:

The key point of this is this:

+	 * XXX: I think there cannot actually be data from an older slot
+	 * here. After a crash we throw away the old stats data and if a slot is
+	 * dropped while shut down, we'll not load the slot data at startup.

I think this is true. Assuming that we don't recreate or rename
objects that have stats after writing out stats, we won't have stats
for a different object with the same name.

Thanks for thinking through this!

If we can rely on that fact, the existing check in pgstat_acquire_replslot()
becomes useless. Thus we don't need to store object name in stats entry. I
agree to that.

I don't agree with this aspect. I think it's better to ensure that the stats
object exists when acquiring a slot, rather than later, when reporting. It's a
lot simpler to think through the lock nesting etc that way.

I'd also eventually like to remove the stats that are currently kept
separately in ReorderBuffer, and that will be easier/cheaper if we can rely on
the stats objects to have already been created.

Here's a cleaned up version of my earlier prototype.

- I wrapped the access to ReplicationSlotCtl->replication_slots[i].data.name
in a new function bool ReplicationSlotName(index, name). I'm not entirely
happy with that name, as it sounds like a more general accessor than it is -
I toyed with ReplicationSlotNameForIndex(), but that seemed somewhat
unnecessary, I don't forsee a need for another name accessor.

Anyone wants to weigh in?

- Substantial comment and commit message polishing

- I'm planning to drop PgStat_StatReplSlotEntry.slotname and a
PGSTAT_FILE_FORMAT_ID bump in master and to rename slotname to
slotname_unused in 15.

- Self-review found a bug, I copy-pasted create=false in the call to
pgstat_get_entry_ref() in pgstat_acquire_replslot(). This did *NOT* cause
any test failures - clearly our test coverage in this area is woefully
inadequate.

- This patch does not contain a test for the fix. I think this can only be
tested by a tap test starting pg_recvlogical in the background and checking
pg_recvlogical's output. That type of test is notoriously hard to be
reliable, so committing it shortly before the release is wrapped seems like
a bad idea.

I manually verified that:
- a continually active slot reporting stats after pgstat_reset_replslot()
works correctly (this is what crashed before)
- replslot stats reporting works correctly after stats were removed
- upgrading from pre-fix to post-fix preserves stats (when keeping slotname /
not increasing the stats version, of course)

I'm planning to push this either later tonight (if I feel up to it after
cooking dinner) or tomorrow morning PST, due to the release wrap deadline.

Greetings,

Andres Freund

Attachments:

v4-0001-pgstat-Prevent-stats-reset-from-corrupting-slotna.patchtext/x-diff; charset=us-asciiDownload
From 4c13dc7657fb7b9a853055b693564706847b7841 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Fri, 7 Oct 2022 19:37:48 -0700
Subject: [PATCH v4] pgstat: Prevent stats reset from corrupting slotname by
 removing slotname

Previously PgStat_StatReplSlotEntry contained the slotname, which was mainly
used when writing out the stats during shutdown, to identify the slot in the
serialized data (at runtime the index in ReplicationSlotCtl->replication_slots
is used, but that can change during a restart). Unfortunately the slotname was
overwritten when the slot's stats were reset.

That turned out to only cause "real" problems if the slot was active during
the reset, triggering an assertion failure at the next
pgstat_report_replslot(). In other paths the stats were re-initialized during
pgstat_acquire_replslot().

Fix this by removing slotname. Instead we can get the slot's name from the
slot itself. Besides fixing a bug, this also is architecturally cleaner (a
name is not really statistics). This is safe because stats, for a slot removed
while shut down, will not be restored at startup.

In 15 the slotname is not removed, but renamed, to avoid changing the stats
format. In master, bump PGSTAT_FILE_FORMAT_ID.

This commit does not contain a test for the fix. I think this can only be
tested by a tap test starting pg_recvlogical in the background and checking
pg_recvlogical's output. That type of test is notoriously hard to be reliable,
so committing it shortly before the release is wrapped seems like a bad idea.

Reported-by: Jaime Casanova <jcasanov@systemguards.com.ec>
Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/YxfagaTXUNa9ggLb@ahch-to
Backpatch: 15-, where the bug was introduced in 5891c7a8ed8f
---
 src/include/pgstat.h                         |  8 ++-
 src/include/replication/slot.h               |  1 +
 src/include/utils/pgstat_internal.h          |  5 +-
 src/backend/replication/slot.c               | 28 ++++++++++
 src/backend/utils/activity/pgstat.c          |  2 +-
 src/backend/utils/activity/pgstat_replslot.c | 56 +++++++++-----------
 6 files changed, 65 insertions(+), 35 deletions(-)

diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index ad7334a0d21..b6cc9e47942 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -242,7 +242,8 @@ typedef struct PgStat_TableXactStatus
  * ------------------------------------------------------------
  */
 
-#define PGSTAT_FILE_FORMAT_ID	0x01A5BCA7
+/* FIXME: before push: only in master */
+#define PGSTAT_FILE_FORMAT_ID	0x01A5BCA8
 
 typedef struct PgStat_ArchiverStats
 {
@@ -321,7 +322,10 @@ typedef struct PgStat_StatFuncEntry
 
 typedef struct PgStat_StatReplSlotEntry
 {
-	NameData	slotname;
+	/* FIXME: perform branch specific action before push */
+#if IN_PG_15_ONLY_UNUSED
+	NameData	slotname_unused;
+#endif
 	PgStat_Counter spill_txns;
 	PgStat_Counter spill_count;
 	PgStat_Counter spill_bytes;
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 8d5e764aef5..65f2c74239d 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -218,6 +218,7 @@ extern void ReplicationSlotsDropDBSlots(Oid dboid);
 extern bool InvalidateObsoleteReplicationSlots(XLogSegNo oldestSegno);
 extern ReplicationSlot *SearchNamedReplicationSlot(const char *name, bool need_lock);
 extern int	ReplicationSlotIndex(ReplicationSlot *slot);
+extern bool ReplicationSlotName(int index, Name name);
 extern void ReplicationSlotNameForTablesync(Oid suboid, Oid relid, char *syncslotname, Size szslot);
 extern void ReplicationSlotDropAtPubNode(WalReceiverConn *wrconn, char *slotname, bool missing_ok);
 
diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h
index 40a36028554..627c1389e4c 100644
--- a/src/include/utils/pgstat_internal.h
+++ b/src/include/utils/pgstat_internal.h
@@ -242,7 +242,8 @@ typedef struct PgStat_KindInfo
 	/*
 	 * For variable-numbered stats with named_on_disk. Optional.
 	 */
-	void		(*to_serialized_name) (const PgStatShared_Common *header, NameData *name);
+	void		(*to_serialized_name) (const PgStat_HashKey *key,
+									   const PgStatShared_Common *header, NameData *name);
 	bool		(*from_serialized_name) (const NameData *name, PgStat_HashKey *key);
 
 	/*
@@ -567,7 +568,7 @@ extern void pgstat_relation_delete_pending_cb(PgStat_EntryRef *entry_ref);
  */
 
 extern void pgstat_replslot_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
-extern void pgstat_replslot_to_serialized_name_cb(const PgStatShared_Common *header, NameData *name);
+extern void pgstat_replslot_to_serialized_name_cb(const PgStat_HashKey *key, const PgStatShared_Common *header, NameData *name);
 extern bool pgstat_replslot_from_serialized_name_cb(const NameData *name, PgStat_HashKey *key);
 
 
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index e9328961dd3..d58d16e992a 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -412,6 +412,34 @@ ReplicationSlotIndex(ReplicationSlot *slot)
 	return slot - ReplicationSlotCtl->replication_slots;
 }
 
+/*
+ * If the slot at 'index' is unused, return false. Otherwise 'name' is set to
+ * the slot's name and true is returned.
+ *
+ * This likely is only useful for pgstat_replslot.c during shutdown, in other
+ * cases there are obvious TOCTOU issues.
+ */
+bool
+ReplicationSlotName(int index, Name name)
+{
+	ReplicationSlot *slot;
+	bool		found;
+
+	slot = &ReplicationSlotCtl->replication_slots[index];
+
+	/*
+	 * Ensure that the slot cannot be dropped while we copy the name. Don't
+	 * need the spinlock as the name of an existing slot cannot change.
+	 */
+	LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+	found = slot->in_use;
+	if (slot->in_use)
+		namestrcpy(name, NameStr(slot->data.name));
+	LWLockRelease(ReplicationSlotControlLock);
+
+	return found;
+}
+
 /*
  * Find a previously created slot and mark it as used by this process.
  *
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index 609f0b1ad86..1b97597f17c 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -1367,7 +1367,7 @@ pgstat_write_statsfile(void)
 			/* stats entry identified by name on disk (e.g. slots) */
 			NameData	name;
 
-			kind_info->to_serialized_name(shstats, &name);
+			kind_info->to_serialized_name(&ps->key, shstats, &name);
 
 			fputc('N', fpout);
 			write_chunk_s(fpout, &ps->key.kind);
diff --git a/src/backend/utils/activity/pgstat_replslot.c b/src/backend/utils/activity/pgstat_replslot.c
index 2039ac3147f..252d2e4e078 100644
--- a/src/backend/utils/activity/pgstat_replslot.c
+++ b/src/backend/utils/activity/pgstat_replslot.c
@@ -69,6 +69,10 @@ pgstat_reset_replslot(const char *name)
 
 /*
  * Report replication slot statistics.
+ *
+ * We can rely on the stats for the slot to exist and to belong to this
+ * slot. We can only get here if pgstat_create_replslot() or
+ * pgstat_acquire_replslot() have already been called.
  */
 void
 pgstat_report_replslot(ReplicationSlot *slot, const PgStat_StatReplSlotEntry *repSlotStat)
@@ -82,12 +86,6 @@ pgstat_report_replslot(ReplicationSlot *slot, const PgStat_StatReplSlotEntry *re
 	shstatent = (PgStatShared_ReplSlot *) entry_ref->shared_stats;
 	statent = &shstatent->stats;
 
-	/*
-	 * Any mismatch should have been fixed in pgstat_create_replslot() or
-	 * pgstat_acquire_replslot().
-	 */
-	Assert(namestrcmp(&statent->slotname, NameStr(slot->data.name)) == 0);
-
 	/* Update the replication slot statistics */
 #define REPLSLOT_ACC(fld) statent->fld += repSlotStat->fld
 	REPLSLOT_ACC(spill_txns);
@@ -124,38 +122,29 @@ pgstat_create_replslot(ReplicationSlot *slot)
 	 * if we previously crashed after dropping a slot.
 	 */
 	memset(&shstatent->stats, 0, sizeof(shstatent->stats));
-	namestrcpy(&shstatent->stats.slotname, NameStr(slot->data.name));
 
 	pgstat_unlock_entry(entry_ref);
 }
 
 /*
  * Report replication slot has been acquired.
+ *
+ * This guarantees that a stats entry exists during later
+ * pgstat_report_replslot() calls.
+ *
+ * If we previously crashed, no stats data exists. But if we did not crash,
+ * the stats do belong to this slot:
+ * - the stats cannot belong to a dropped slot, pgstat_drop_replslot() would
+ *   have been called
+ * - if the slot was removed while shut down,
+ *   pgstat_replslot_from_serialized_name_cb() returning false would have
+ *   caused the stats to be dropped
  */
 void
 pgstat_acquire_replslot(ReplicationSlot *slot)
 {
-	PgStat_EntryRef *entry_ref;
-	PgStatShared_ReplSlot *shstatent;
-	PgStat_StatReplSlotEntry *statent;
-
-	entry_ref = pgstat_get_entry_ref_locked(PGSTAT_KIND_REPLSLOT, InvalidOid,
-											ReplicationSlotIndex(slot), false);
-	shstatent = (PgStatShared_ReplSlot *) entry_ref->shared_stats;
-	statent = &shstatent->stats;
-
-	/*
-	 * NB: need to accept that there might be stats from an older slot, e.g.
-	 * if we previously crashed after dropping a slot.
-	 */
-	if (NameStr(statent->slotname)[0] == 0 ||
-		namestrcmp(&statent->slotname, NameStr(slot->data.name)) != 0)
-	{
-		memset(statent, 0, sizeof(*statent));
-		namestrcpy(&statent->slotname, NameStr(slot->data.name));
-	}
-
-	pgstat_unlock_entry(entry_ref);
+	pgstat_get_entry_ref(PGSTAT_KIND_REPLSLOT, InvalidOid,
+						 ReplicationSlotIndex(slot), true, NULL);
 }
 
 /*
@@ -185,9 +174,16 @@ pgstat_fetch_replslot(NameData slotname)
 }
 
 void
-pgstat_replslot_to_serialized_name_cb(const PgStatShared_Common *header, NameData *name)
+pgstat_replslot_to_serialized_name_cb(const PgStat_HashKey *key, const PgStatShared_Common *header, NameData *name)
 {
-	namestrcpy(name, NameStr(((PgStatShared_ReplSlot *) header)->stats.slotname));
+	/*
+	 * This is only called late during shutdown. The set of existing slots
+	 * isn't allowed to change at this point, we can assume that a slot exists
+	 * at the offset.
+	 */
+	if (!ReplicationSlotName(key->objoid, name))
+		elog(ERROR, "could not find name for replication slot index %u",
+			 key->objoid);
 }
 
 bool
-- 
2.38.0

#26Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#25)
Re: START_REPLICATION SLOT causing a crash in an assert build

Hi,

On 2022-10-07 19:56:33 -0700, Andres Freund wrote:

I'm planning to push this either later tonight (if I feel up to it after
cooking dinner) or tomorrow morning PST, due to the release wrap deadline.

I looked this over again, tested a bit more, and pushed the adjusted 15 and
master versions to github to get a CI run. Once that passes, as I expect, I'll
push them for real.

Greetings,

Andres Freund

#27Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#26)
Re: START_REPLICATION SLOT causing a crash in an assert build

On 2022-10-08 09:53:50 -0700, Andres Freund wrote:

On 2022-10-07 19:56:33 -0700, Andres Freund wrote:

I'm planning to push this either later tonight (if I feel up to it after
cooking dinner) or tomorrow morning PST, due to the release wrap deadline.

I looked this over again, tested a bit more, and pushed the adjusted 15 and
master versions to github to get a CI run. Once that passes, as I expect, I'll
push them for real.

Those passed and thus pushed.

Thanks for the report, debugging and review everyone!

I think we need at least the following tests for replslots:
- a reset while decoding is ongoing works correctly
- replslot stats continue to be accumulated after stats have been removed

I wonder how much it'd take to teach isolationtester to handle the replication
protocol...

#28Jonathan S. Katz
jkatz@postgresql.org
In reply to: Andres Freund (#27)
Re: START_REPLICATION SLOT causing a crash in an assert build

On 10/8/22 1:40 PM, Andres Freund wrote:

On 2022-10-08 09:53:50 -0700, Andres Freund wrote:

On 2022-10-07 19:56:33 -0700, Andres Freund wrote:

I'm planning to push this either later tonight (if I feel up to it after
cooking dinner) or tomorrow morning PST, due to the release wrap deadline.

I looked this over again, tested a bit more, and pushed the adjusted 15 and
master versions to github to get a CI run. Once that passes, as I expect, I'll
push them for real.

Those passed and thus pushed.

Thanks for the report, debugging and review everyone!

Thanks for the quick turnaround! I've closed the open item.

Thanks,

Jonathan

#29Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Andres Freund (#27)
1 attachment(s)
Re: START_REPLICATION SLOT causing a crash in an assert build

On Sun, Oct 9, 2022 at 2:42 AM Andres Freund <andres@anarazel.de> wrote:

On 2022-10-08 09:53:50 -0700, Andres Freund wrote:

On 2022-10-07 19:56:33 -0700, Andres Freund wrote:

I'm planning to push this either later tonight (if I feel up to it after
cooking dinner) or tomorrow morning PST, due to the release wrap deadline.

I looked this over again, tested a bit more, and pushed the adjusted 15 and
master versions to github to get a CI run. Once that passes, as I expect, I'll
push them for real.

Those passed and thus pushed.

Thanks for the report, debugging and review everyone!

Thanks!

I think we need at least the following tests for replslots:
- a reset while decoding is ongoing works correctly
- replslot stats continue to be accumulated after stats have been removed

I wonder how much it'd take to teach isolationtester to handle the replication
protocol...

I think we can do these tests by using pg_recvlogical in TAP tests.
I've attached a patch to do that.

Regards,

--
Masahiko Sawada
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

regression_test_for_replslot_stats.patchapplication/octet-stream; name=regression_test_for_replslot_stats.patchDownload
diff --git a/contrib/test_decoding/t/001_repl_stats.pl b/contrib/test_decoding/t/001_repl_stats.pl
index daf03ff147..513cebf2ad 100644
--- a/contrib/test_decoding/t/001_repl_stats.pl
+++ b/contrib/test_decoding/t/001_repl_stats.pl
@@ -1,8 +1,7 @@
 
 # Copyright (c) 2021-2022, PostgreSQL Global Development Group
 
-# Test replication statistics data in pg_stat_replication_slots is sane after
-# drop replication slot and restart.
+# Test replication statistics data in pg_stat_replication_slots
 use strict;
 use warnings;
 use File::Path qw(rmtree);
@@ -10,6 +9,9 @@ use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
 use Test::More;
 
+# Test if statistics data in pg_stat_replication_slots is sane after drop replication
+# slot and restart.
+
 # Test set-up
 my $node = PostgreSQL::Test::Cluster->new('test');
 $node->init(allows_streaming => 'logical');
@@ -118,4 +120,78 @@ $node->safe_psql('postgres',
 # shutdown
 $node->stop;
 
+# Test if resetting statistics data in pg_stat_replication_slots while decoding is
+# ongoing works correctly.
+$node = PostgreSQL::Test::Cluster->new('test2');
+$node->init(allows_streaming => 'logical');
+$node->start;
+
+$node->safe_psql('postgres', "CREATE TABLE test(i int)");
+
+# The walsender process creates and acquires the replication slot.
+my $pg_recvlogical = IPC::Run::start(
+	[
+		'pg_recvlogical',           '-S',
+		'regression_slot',          '-d',
+		$node->connstr('postgres'), '--start',
+		'--create-slot',            '--no-loop',
+		'-f',                       '-'
+	]);
+
+# Wait for the replication slot to become active.
+$node->poll_query_until('postgres',
+	"SELECT EXISTS (SELECT 1 FROM pg_replication_slots WHERE slot_name = 'regression_slot' AND active_pid IS NOT NULL)"
+) or die "slot never became active";
+
+# Reset the replication slot statistics.
+$node->safe_psql('postgres',
+	"SELECT pg_stat_reset_replication_slot('regression_slot');");
+my $result = $node->safe_psql('postgres',
+	"SELECT * FROM pg_stat_replication_slots WHERE slot_name = 'regrssion_slot'"
+);
+is($result, "", "replication slot statistics are reset");
+
+# Test if the walsender properly updates the statistics.
+$node->safe_psql('postgres', "INSERT INTO test VALUES (1)");
+$node->poll_query_until('postgres',
+	"SELECT total_txns > 0 FROM pg_stat_replication_slots WHERE slot_name = 'regression_slot'"
+);
+
+# Teardown the node so the statistics is removed.
+$pg_recvlogical->kill_kill;
+$node->teardown_node;
+$node->start;
+
+# Start pg_recvlogical again.
+$pg_recvlogical = IPC::Run::start(
+	[
+		'pg_recvlogical',           '-S',
+		'regression_slot',          '-d',
+		$node->connstr('postgres'), '--start',
+		'--no-loop',                '-f',
+		'-'
+	]);
+
+# Wait for the slot to become active.
+$node->poll_query_until('postgres',
+	"SELECT EXISTS (SELECT 1 FROM pg_replication_slots WHERE slot_name = 'regression_slot' AND active_pid IS NOT NULL)"
+) or die "slot never became active";
+
+# Check if the replication slot statistics have been removed.
+$result = $node->safe_psql('postgres',
+	"SELECT * FROM pg_stat_replication_slots WHERE slot_name = 'regrssion_slot'"
+);
+is($result, "", "replication slot statistics are removed");
+
+# Test if the replication slot staistics continue to be accumulated even after
+# statistics have been removed.
+$node->safe_psql('postgres', "INSERT INTO test VALUES (1)");
+$node->poll_query_until('postgres',
+	"SELECT total_txns > 0 FROM pg_stat_replication_slots WHERE slot_name = 'regression_slot'"
+);
+
+# shutdown
+$pg_recvlogical->kill_kill;
+$node->stop;
+
 done_testing();
#30Andres Freund
andres@anarazel.de
In reply to: Masahiko Sawada (#29)
Re: START_REPLICATION SLOT causing a crash in an assert build

Hi,

On 2022-10-11 17:10:52 +0900, Masahiko Sawada wrote:

+# Reset the replication slot statistics.
+$node->safe_psql('postgres',
+	"SELECT pg_stat_reset_replication_slot('regression_slot');");
+my $result = $node->safe_psql('postgres',
+	"SELECT * FROM pg_stat_replication_slots WHERE slot_name = 'regrssion_slot'"
+);

Typo in the slot name "regrssion_slot" instead of "regression_slot". We can't
use * here, because that'll include the reset timestamp.

+# Teardown the node so the statistics is removed.
+$pg_recvlogical->kill_kill;
+$node->teardown_node;
+$node->start;

ISTM that removing the file instead of shutting down the cluster with force
would make it a more targeted test.

+# Check if the replication slot statistics have been removed.
+$result = $node->safe_psql('postgres',
+	"SELECT * FROM pg_stat_replication_slots WHERE slot_name = 'regrssion_slot'"
+);
+is($result, "", "replication slot statistics are removed");

Same typo as above. We can't assert a specific result here either, because
recvlogical will have processed a bunch of changes. Perhaps we could check at
least that the reset time is NULL?

+# Test if the replication slot staistics continue to be accumulated even after

s/staistics/statistics/

Greetings,

Andres Freund

#31Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Andres Freund (#30)
1 attachment(s)
Re: START_REPLICATION SLOT causing a crash in an assert build

On Thu, Oct 13, 2022 at 1:21 AM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2022-10-11 17:10:52 +0900, Masahiko Sawada wrote:

+# Reset the replication slot statistics.
+$node->safe_psql('postgres',
+     "SELECT pg_stat_reset_replication_slot('regression_slot');");
+my $result = $node->safe_psql('postgres',
+     "SELECT * FROM pg_stat_replication_slots WHERE slot_name = 'regrssion_slot'"
+);

Typo in the slot name "regrssion_slot" instead of "regression_slot". We can't
use * here, because that'll include the reset timestamp.

Fixed.

+# Teardown the node so the statistics is removed.
+$pg_recvlogical->kill_kill;
+$node->teardown_node;
+$node->start;

ISTM that removing the file instead of shutting down the cluster with force
would make it a more targeted test.

Agreed.

+# Check if the replication slot statistics have been removed.
+$result = $node->safe_psql('postgres',
+     "SELECT * FROM pg_stat_replication_slots WHERE slot_name = 'regrssion_slot'"
+);
+is($result, "", "replication slot statistics are removed");

Same typo as above. We can't assert a specific result here either, because
recvlogical will have processed a bunch of changes. Perhaps we could check at
least that the reset time is NULL?

Agreed.

+# Test if the replication slot staistics continue to be accumulated even after

s/staistics/statistics/

Fixed.

I've attached an updated patch. I've added the common function to
start pg_recvlogical and wait for it to become active. Please review
it.

Regards,

--
Masahiko Sawada
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

regression_test_for_replslot_stats_v2.patchapplication/octet-stream; name=regression_test_for_replslot_stats_v2.patchDownload
diff --git a/contrib/test_decoding/t/001_repl_stats.pl b/contrib/test_decoding/t/001_repl_stats.pl
index daf03ff147..6be3003fb0 100644
--- a/contrib/test_decoding/t/001_repl_stats.pl
+++ b/contrib/test_decoding/t/001_repl_stats.pl
@@ -1,8 +1,7 @@
 
 # Copyright (c) 2021-2022, PostgreSQL Global Development Group
 
-# Test replication statistics data in pg_stat_replication_slots is sane after
-# drop replication slot and restart.
+# Test replication statistics data in pg_stat_replication_slots
 use strict;
 use warnings;
 use File::Path qw(rmtree);
@@ -10,6 +9,9 @@ use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
 use Test::More;
 
+# Test if statistics data in pg_stat_replication_slots is sane after drop replication
+# slot and restart.
+
 # Test set-up
 my $node = PostgreSQL::Test::Cluster->new('test');
 $node->init(allows_streaming => 'logical');
@@ -118,4 +120,84 @@ $node->safe_psql('postgres',
 # shutdown
 $node->stop;
 
+# Test if resetting statistics data in pg_stat_replication_slots while decoding is
+# ongoing works correctly.
+
+# Start pg_recvlogical process and wait for it to become active.
+sub start_pg_recvlogical
+{
+	my ($node, $slot_name, $create_slot) = @_;
+
+	my @cmd = (
+		'pg_recvlogical', '-S', "$slot_name", '-d',
+		$node->connstr('postgres'),
+		'--start', '--no-loop', '-f', '-');
+	push @cmd, '--create-slot' if $create_slot;
+
+	# start pg_recvlogical process.
+	my $pg_recvlogical = IPC::Run::start(@cmd);
+
+	# Wait for the replication slot to become active.
+	$node->poll_query_until('postgres',
+		"SELECT EXISTS (SELECT 1 FROM pg_replication_slots WHERE slot_name = '$slot_name' AND active_pid IS NOT NULL)"
+	) or die "slot never became active";
+
+	return $pg_recvlogical;
+}
+
+$node = PostgreSQL::Test::Cluster->new('test2');
+$node->init(allows_streaming => 'logical');
+$node->start;
+$node->safe_psql('postgres', "CREATE TABLE test(i int)");
+
+# Start pg_recvlogical process. The walsender process creates and acquires
+# the replication slot.
+my $slot_name = "regression_slot";
+my $pg_recvlogical = start_pg_recvlogical($node, $slot_name, 1);
+
+# Reset the replication slot statistics.
+$node->safe_psql('postgres',
+	"SELECT pg_stat_reset_replication_slot('$slot_name');");
+my $result = $node->safe_psql('postgres',
+	"SELECT stats_reset IS NOT NULL FROM pg_stat_replication_slots WHERE slot_name = '$slot_name'"
+);
+is($result, "t", "replication slot statistics are reset");
+
+# Test if the walsender properly updates the statistics.
+$node->safe_psql('postgres', "INSERT INTO test VALUES (1)");
+$node->poll_query_until('postgres',
+	"SELECT total_txns > 0 FROM pg_stat_replication_slots WHERE slot_name = '$slot_name'"
+);
+
+# Shutdown
+$pg_recvlogical->kill_kill;
+$node->stop;
+
+# Remove the stats file and restart the server.
+$datadir = $node->data_dir;
+my $stats_file = "$datadir/pg_stat/pgstat.stat";
+unlink($stats_file)
+  or die "could not remove $stats_file";
+$node->start;
+
+# Start pg_recvlogical again.
+$pg_recvlogical = start_pg_recvlogical($node, $slot_name, 0);
+
+# Check if the replication slot statistics have been removed.
+$result = $node->safe_psql('postgres',
+	"SELECT stats_reset IS NULL FROM pg_stat_replication_slots WHERE slot_name = '$slot_name'"
+);
+is($result, "t", "replication slot statistics are removed");
+
+# Test if the replication slot statistics continue to be accumulated even
+# after statistics have been removed.
+$node->safe_psql('postgres', "INSERT INTO test VALUES (1)");
+$node->poll_query_until('postgres',
+	"SELECT total_txns > 0 FROM pg_stat_replication_slots WHERE slot_name = '$slot_name'"
+);
+
+# Shutdown
+$pg_recvlogical->kill_kill;
+$node->stop;
+
 done_testing();
#32Andres Freund
andres@anarazel.de
In reply to: Masahiko Sawada (#31)
Re: START_REPLICATION SLOT causing a crash in an assert build

Hi,

On 2022-10-13 15:57:28 +0900, Masahiko Sawada wrote:

I've attached an updated patch. I've added the common function to
start pg_recvlogical and wait for it to become active. Please review
it.

+# Start pg_recvlogical process and wait for it to become active.
+sub start_pg_recvlogical
+{
+	my ($node, $slot_name, $create_slot) = @_;
+
+	my @cmd = (
+		'pg_recvlogical', '-S', "$slot_name", '-d',
+		$node->connstr('postgres'),
+		'--start', '--no-loop', '-f', '-');
+	push @cmd, '--create-slot' if $create_slot;
+
+	# start pg_recvlogical process.
+	my $pg_recvlogical = IPC::Run::start(@cmd);
+
+	# Wait for the replication slot to become active.
+	$node->poll_query_until('postgres',
+		"SELECT EXISTS (SELECT 1 FROM pg_replication_slots WHERE slot_name = '$slot_name' AND active_pid IS NOT NULL)"
+	) or die "slot never became active";
+
+	return $pg_recvlogical;
+}

Because you never process the output from pg_recvlogical I think this test
will fail if the pipe buffer is small (or more changes are made). I think
either it needs to output to a file, or we need to process the output.

A file seems the easier solution in this case, we don't really care about what
changes are streamed to the client, right?

+$node = PostgreSQL::Test::Cluster->new('test2');
+$node->init(allows_streaming => 'logical');
+$node->start;
+$node->safe_psql('postgres', "CREATE TABLE test(i int)");

Why are we creating a new cluster? Initdbs takes a fair bit of time on some
platforms, so this seems unnecessary?

Greetings,

Andres Freund

#33Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Andres Freund (#32)
Re: START_REPLICATION SLOT causing a crash in an assert build

On Thu, Oct 20, 2022 at 6:54 AM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2022-10-13 15:57:28 +0900, Masahiko Sawada wrote:

I've attached an updated patch. I've added the common function to
start pg_recvlogical and wait for it to become active. Please review
it.

+# Start pg_recvlogical process and wait for it to become active.
+sub start_pg_recvlogical
+{
+     my ($node, $slot_name, $create_slot) = @_;
+
+     my @cmd = (
+             'pg_recvlogical', '-S', "$slot_name", '-d',
+             $node->connstr('postgres'),
+             '--start', '--no-loop', '-f', '-');
+     push @cmd, '--create-slot' if $create_slot;
+
+     # start pg_recvlogical process.
+     my $pg_recvlogical = IPC::Run::start(@cmd);
+
+     # Wait for the replication slot to become active.
+     $node->poll_query_until('postgres',
+             "SELECT EXISTS (SELECT 1 FROM pg_replication_slots WHERE slot_name = '$slot_name' AND active_pid IS NOT NULL)"
+     ) or die "slot never became active";
+
+     return $pg_recvlogical;
+}

Because you never process the output from pg_recvlogical I think this test
will fail if the pipe buffer is small (or more changes are made). I think
either it needs to output to a file, or we need to process the output.

Okay, but how can we test this situation? As far as I tested, if we
don't specify the redirection of pg_recvlogical's output as above,
pg_recvlogical's stdout and stderr are output to the log file. So I
could not reproduce the issue you're concerned about. Which pipe do
you refer to?

A file seems the easier solution in this case, we don't really care about what
changes are streamed to the client, right?

+$node = PostgreSQL::Test::Cluster->new('test2');
+$node->init(allows_streaming => 'logical');
+$node->start;
+$node->safe_psql('postgres', "CREATE TABLE test(i int)");

Why are we creating a new cluster? Initdbs takes a fair bit of time on some
platforms, so this seems unnecessary?

Agreed.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com