START_REPLICATION SLOT causing a crash in an assert build
Hi,
I'm not sure what is causing this, but I have seen this twice. The
second time without activity after changing the set of tables in a
PUBLICATION.
gdb says that debug_query_string contains:
"""
START_REPLICATION SLOT "sub_pgbench" LOGICAL 0/0 (proto_version '3', publication_names '"pub_pgbench"')START_REPLICATION SLOT "sub_pgbench" LOGICAL 0/0 (proto_version '3', publication_names '"pub_pgbench"')
"""
attached the backtrace.
--
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL
Attachments:
gdb_start_replication_slot_crash.txttext/plain; charset=us-asciiDownload
Hi,
On 2022-09-06 18:40:49 -0500, Jaime Casanova wrote:
I'm not sure what is causing this, but I have seen this twice. The
second time without activity after changing the set of tables in a
PUBLICATION.
Can you describe the steps to reproduce?
Which git commit does this happen on?
gdb says that debug_query_string contains:
"""
START_REPLICATION SLOT "sub_pgbench" LOGICAL 0/0 (proto_version '3', publication_names '"pub_pgbench"')START_REPLICATION SLOT "sub_pgbench" LOGICAL 0/0 (proto_version '3', publication_names '"pub_pgbench"')
"""attached the backtrace.
#2 0x00005559bfd4f0ed in ExceptionalCondition (
conditionName=0x5559bff30e20 "namestrcmp(&statent->slotname, NameStr(slot->data.name)) == 0", errorType=0x5559bff30e0d "FailedAssertion", fileName=0x5559bff30dbb "pgstat_replslot.c",
lineNumber=89) at assert.c:69
what are statent->slotname and slot->data.name?
Greetings,
Andres Freund
On Wed, Sep 07, 2022 at 12:39:08PM -0700, Andres Freund wrote:
Hi,
On 2022-09-06 18:40:49 -0500, Jaime Casanova wrote:
I'm not sure what is causing this, but I have seen this twice. The
second time without activity after changing the set of tables in a
PUBLICATION.Can you describe the steps to reproduce?
I'm still trying to determine that
Which git commit does this happen on?
6e55ea79faa56db85a2b6c5bf94cee8acf8bfdb8 (Stamp 15beta4)
gdb says that debug_query_string contains:
"""
START_REPLICATION SLOT "sub_pgbench" LOGICAL 0/0 (proto_version '3', publication_names '"pub_pgbench"')START_REPLICATION SLOT "sub_pgbench" LOGICAL 0/0 (proto_version '3', publication_names '"pub_pgbench"')
"""attached the backtrace.
#2 0x00005559bfd4f0ed in ExceptionalCondition (
conditionName=0x5559bff30e20 "namestrcmp(&statent->slotname, NameStr(slot->data.name)) == 0", errorType=0x5559bff30e0d "FailedAssertion", fileName=0x5559bff30dbb "pgstat_replslot.c",
lineNumber=89) at assert.c:69what are statent->slotname and slot->data.name?
slot->data.name seems to be the replication_slot record, and
statent->slotname comes from the in shared memory stats for that slot.
And the assert happens when &statent->slotname.data comes empty, which
is not frequent but it happens from time to time
btw, while I'm looking at this I found that we can drop a publication
while there are active subscriptions pointing to it, is that something
we should allow?
anyway, that is not the cause of this because the replication slot actually
exists.
--
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL
On Wed, Sep 07, 2022 at 12:39:08PM -0700, Andres Freund wrote:
Hi,
On 2022-09-06 18:40:49 -0500, Jaime Casanova wrote:
I'm not sure what is causing this, but I have seen this twice. The
second time without activity after changing the set of tables in a
PUBLICATION.
This crash happens after a reset of statistics for a slot replication
Can you describe the steps to reproduce?
bin/pg_ctl -D data1 initdb
bin/pg_ctl -D data1 -l logfile1 -o "-c port=54315 -c wal_level=logical" start
bin/psql -p 54315 postgres <<EOF
create table t1 (i int primary key);
create publication pub1 for table t1;
EOF
bin/pg_ctl -D data2 initdb
bin/pg_ctl -D data2 -l logfile2 -o "-c port=54316" start
bin/psql -p 54316 postgres <<EOF
create table t1 (i int primary key);
create subscription sub1 connection 'host=/tmp port=54315 dbname=postgres' publication pub1;
EOF
bin/psql -p 54315 postgres <<EOF
select pg_stat_reset_replication_slot('sub1');
insert into t1 values(1);
EOF
Which git commit does this happen on?
just tested again on f5047c1293acce3c6c3802b06825aa3a9f9aa55a
gdb says that debug_query_string contains:
"""
START_REPLICATION SLOT "sub_pgbench" LOGICAL 0/0 (proto_version '3', publication_names '"pub_pgbench"')START_REPLICATION SLOT "sub_pgbench" LOGICAL 0/0 (proto_version '3', publication_names '"pub_pgbench"')
"""attached the backtrace.
#2 0x00005559bfd4f0ed in ExceptionalCondition (
conditionName=0x5559bff30e20 "namestrcmp(&statent->slotname, NameStr(slot->data.name)) == 0", errorType=0x5559bff30e0d "FailedAssertion", fileName=0x5559bff30dbb "pgstat_replslot.c",
lineNumber=89) at assert.c:69what are statent->slotname and slot->data.name?
and the problem seems to be that after zero'ing the stats that includes
the name of the replication slot, this simple patch fixes it... not sure
if it's the right fix though...
--
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL
Attachments:
fix_reset_stat_replslot.patchtext/x-diff; charset=us-asciiDownload+1-0
Nice finding.
At Tue, 13 Sep 2022 00:39:45 -0500, Jaime Casanova <jcasanov@systemguards.com.ec> wrote in
and the problem seems to be that after zero'ing the stats that includes
the name of the replication slot, this simple patch fixes it... not sure
if it's the right fix though...
That doesn't work. since what that function clears is not the name in
the slot struct but that in stats entry.
The cause is what pg_stat_reset_replslot wants to do does not match
what pgstat feature thinks as reset.
Unfortunately, we don't have a function to leave a portion alone after
a reset. However, fetching the stats entry in pgstat_reset_replslot is
ugly..
I'm not sure this is less uglier but it works if
pgstat_report_replslot sets the name if it is found to be wiped
out... But that hafly nullify the protction by the assertion just
after.
--- a/src/backend/utils/activity/pgstat_replslot.c
+++ b/src/backend/utils/activity/pgstat_replslot.c
@@ -83,9 +83,11 @@ pgstat_report_replslot(ReplicationSlot *slot, const PgStat_StatReplSlotEntry *re
statent = &shstatent->stats;
/*
- * Any mismatch should have been fixed in pgstat_create_replslot() or
- * pgstat_acquire_replslot().
+ * pgstat_create_replslot() and pgstat_acquire_replslot() enters the name,
+ * but pgstat_reset_replslot() may clear it.
*/
+ if (statent->slotname.data[0] == 0)
+ namestrcpy(&shstatent->stats.slotname, NameStr(slot->data.name));
Assert(namestrcmp(&statent->slotname, NameStr(slot->data.name)) == 0);
Another measure would be to add the region to wipe-out on reset to
PgStat_KindInfo, but it seems too much.. (attached)
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
replslot_reset_fix.txttext/plain; charset=us-asciiDownload+20-2
On Tue, Sep 13, 2022 at 06:48:45PM +0900, Kyotaro Horiguchi wrote:
Nice finding.
At Tue, 13 Sep 2022 00:39:45 -0500, Jaime Casanova <jcasanov@systemguards.com.ec> wrote in
and the problem seems to be that after zero'ing the stats that includes
the name of the replication slot, this simple patch fixes it... not sure
if it's the right fix though...That doesn't work. since what that function clears is not the name in
the slot struct but that in stats entry.
you're right... the curious thing is that I tested it and it worked, but
now it doesn't... maybe it was too late for me...
The cause is what pg_stat_reset_replslot wants to do does not match
what pgstat feature thinks as reset.
[...]
Another measure would be to add the region to wipe-out on reset to
PgStat_KindInfo, but it seems too much.. (attached)
This patch solves the problem, i didn't like the other solution because
as you say it partly nullify the protection of the assertion.
--
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL
On Tue, Sep 13, 2022 at 10:07:50PM -0500, Jaime Casanova wrote:
On Tue, Sep 13, 2022 at 06:48:45PM +0900, Kyotaro Horiguchi wrote:
Another measure would be to add the region to wipe-out on reset to
PgStat_KindInfo, but it seems too much.. (attached)This patch solves the problem, i didn't like the other solution because
as you say it partly nullify the protection of the assertion.
I talked too fast, while it solves the immediate problem the patch as is
causes other crashes.
--
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL
At Thu, 15 Sep 2022 01:26:15 -0500, Jaime Casanova <jcasanov@systemguards.com.ec> wrote in
On Tue, Sep 13, 2022 at 10:07:50PM -0500, Jaime Casanova wrote:
On Tue, Sep 13, 2022 at 06:48:45PM +0900, Kyotaro Horiguchi wrote:
Another measure would be to add the region to wipe-out on reset to
PgStat_KindInfo, but it seems too much.. (attached)This patch solves the problem, i didn't like the other solution because
as you say it partly nullify the protection of the assertion.I talked too fast, while it solves the immediate problem the patch as is
causes other crashes.
Where did the crash happen? Is it a bug introduced by it? Or does it
root to other cause?
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Thu, Sep 15, 2022 at 05:30:11PM +0900, Kyotaro Horiguchi wrote:
At Thu, 15 Sep 2022 01:26:15 -0500, Jaime Casanova <jcasanov@systemguards.com.ec> wrote in
On Tue, Sep 13, 2022 at 10:07:50PM -0500, Jaime Casanova wrote:
On Tue, Sep 13, 2022 at 06:48:45PM +0900, Kyotaro Horiguchi wrote:
Another measure would be to add the region to wipe-out on reset to
PgStat_KindInfo, but it seems too much.. (attached)This patch solves the problem, i didn't like the other solution because
as you say it partly nullify the protection of the assertion.I talked too fast, while it solves the immediate problem the patch as is
causes other crashes.Where did the crash happen? Is it a bug introduced by it? Or does it
root to other cause?
Just compile and run the installcheck tests.
It fails at ./src/backend/utils/activity/pgstat_shmem.c:530 inside
pgstat_release_entry_ref() because it expects a "deadbeef", it seems to
be a magic variable but cannot find what its use is.
Attached a backtrace.
--
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL
Attachments:
gdb.txttext/plain; charset=us-asciiDownload
At Thu, 15 Sep 2022 11:15:12 -0500, Jaime Casanova <jcasanov@systemguards.com.ec> wrote in
It fails at ./src/backend/utils/activity/pgstat_shmem.c:530 inside
Thanks for the info. I reproduced by make check.. stupid..
It's the thinko about the base address of reset_off.
So the attached doesn't crash..
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
replslot_reset_fix_2.txttext/plain; charset=us-asciiDownload+22-2
On Fri, Sep 16, 2022 at 02:37:17PM +0900, Kyotaro Horiguchi wrote:
At Thu, 15 Sep 2022 11:15:12 -0500, Jaime Casanova <jcasanov@systemguards.com.ec> wrote in
It fails at ./src/backend/utils/activity/pgstat_shmem.c:530 inside
Thanks for the info. I reproduced by make check.. stupid..
It's the thinko about the base address of reset_off.
So the attached doesn't crash..
Hi,
Just confirming there have been no crash since this last patch.
--
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL
At Mon, 19 Sep 2022 11:04:03 -0500, Jaime Casanova <jcasanov@systemguards.com.ec> wrote in
On Fri, Sep 16, 2022 at 02:37:17PM +0900, Kyotaro Horiguchi wrote:
At Thu, 15 Sep 2022 11:15:12 -0500, Jaime Casanova <jcasanov@systemguards.com.ec> wrote in
It fails at ./src/backend/utils/activity/pgstat_shmem.c:530 inside
Thanks for the info. I reproduced by make check.. stupid..
It's the thinko about the base address of reset_off.
So the attached doesn't crash..
Hi,
Just confirming there have been no crash since this last patch.
Thanks for confirmation.
Althouh I'm not sure whether this is the right direction, this seems
to be an open item of 15?
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Hi,
I wonder if the correct fix here wouldn't be to move the slotname out of
PgStat_StatReplSlotEntry?
On 2022-09-16 14:37:17 +0900, Kyotaro Horiguchi wrote:
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c index 6224c498c2..ed3f3af4d9 100644 --- a/src/backend/utils/activity/pgstat.c +++ b/src/backend/utils/activity/pgstat.c @@ -263,6 +263,8 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = { .shared_size = sizeof(PgStatShared_Database), .shared_data_off = offsetof(PgStatShared_Database, stats), .shared_data_len = sizeof(((PgStatShared_Database *) 0)->stats), + .reset_off = 0, + .reset_len = sizeof(((PgStatShared_Database *) 0)->stats), .pending_size = sizeof(PgStat_StatDBEntry),.flush_pending_cb = pgstat_database_flush_cb, @@ -277,6 +279,8 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = { .shared_size = sizeof(PgStatShared_Relation), .shared_data_off = offsetof(PgStatShared_Relation, stats), .shared_data_len = sizeof(((PgStatShared_Relation *) 0)->stats), + .reset_off = 0, + .reset_len = sizeof(((PgStatShared_Relation *) 0)->stats), .pending_size = sizeof(PgStat_TableStatus),.flush_pending_cb = pgstat_relation_flush_cb, @@ -291,6 +295,8 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = { .shared_size = sizeof(PgStatShared_Function), .shared_data_off = offsetof(PgStatShared_Function, stats), .shared_data_len = sizeof(((PgStatShared_Function *) 0)->stats), + .reset_off = 0, + .reset_len = sizeof(((PgStatShared_Function *) 0)->stats), .pending_size = sizeof(PgStat_BackendFunctionEntry),.flush_pending_cb = pgstat_function_flush_cb, @@ -307,6 +313,10 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = { .shared_size = sizeof(PgStatShared_ReplSlot), .shared_data_off = offsetof(PgStatShared_ReplSlot, stats), .shared_data_len = sizeof(((PgStatShared_ReplSlot *) 0)->stats), + /* reset doesn't wipe off slot name */ + .reset_off = offsetof(PgStat_StatReplSlotEntry, spill_txns), + .reset_len = sizeof(((PgStatShared_ReplSlot *) 0)->stats), + offsetof(PgStat_StatReplSlotEntry, spill_txns),
I'm confused what this offsetof does here? It's not even assigned to a
specific field? Am I missing something?
Also, wouldn't we need to subtract something of the size?
diff --git a/src/backend/utils/activity/pgstat_shmem.c b/src/backend/utils/activity/pgstat_shmem.c index ac98918688..09a8c3873c 100644 --- a/src/backend/utils/activity/pgstat_shmem.c +++ b/src/backend/utils/activity/pgstat_shmem.c @@ -915,8 +915,9 @@ shared_stat_reset_contents(PgStat_Kind kind, PgStatShared_Common *header, { const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);- memset(pgstat_get_entry_data(kind, header), 0, - pgstat_get_entry_len(kind)); + memset((char *)pgstat_get_entry_data(kind, header) + + kind_info->reset_off, 0, + kind_info->reset_len);if (kind_info->reset_timestamp_cb)
kind_info->reset_timestamp_cb(header, ts);
This likely doesn't quite conform to what pgindent wants...
Greetings,
Andres Freund
Thanks!
At Mon, 26 Sep 2022 19:53:02 -0700, Andres Freund <andres@anarazel.de> wrote in
I wonder if the correct fix here wouldn't be to move the slotname out of
PgStat_StatReplSlotEntry?
Ugh. Right. I thought its outer struct as purely the part for the
common header. But we can freely place anything after the header
part. I moved it to the outer struct. I didn't clear that part in
pgstat_create_relation() because it is filled in immediately.
The attached is that.
On 2022-09-16 14:37:17 +0900, Kyotaro Horiguchi wrote:
...
@@ -307,6 +313,10 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = { .shared_size = sizeof(PgStatShared_ReplSlot), .shared_data_off = offsetof(PgStatShared_ReplSlot, stats), .shared_data_len = sizeof(((PgStatShared_ReplSlot *) 0)->stats), + /* reset doesn't wipe off slot name */ + .reset_off = offsetof(PgStat_StatReplSlotEntry, spill_txns), + .reset_len = sizeof(((PgStatShared_ReplSlot *) 0)->stats), + offsetof(PgStat_StatReplSlotEntry, spill_txns),I'm confused what this offsetof does here? It's not even assigned to a
specific field? Am I missing something?Also, wouldn't we need to subtract something of the size?
Yeah, I felt it confusing. The last line above is offset from just
after the header part (it is PgStat_, not PgStatShared_). I first
wrote that as you suggested but rewrote to shorter representation.
diff --git a/src/backend/utils/activity/pgstat_shmem.c b/src/backend/utils/activity/pgstat_shmem.c index ac98918688..09a8c3873c 100644 --- a/src/backend/utils/activity/pgstat_shmem.c +++ b/src/backend/utils/activity/pgstat_shmem.c @@ -915,8 +915,9 @@ shared_stat_reset_contents(PgStat_Kind kind, PgStatShared_Common *header, { const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);- memset(pgstat_get_entry_data(kind, header), 0, - pgstat_get_entry_len(kind)); + memset((char *)pgstat_get_entry_data(kind, header) + + kind_info->reset_off, 0, + kind_info->reset_len);if (kind_info->reset_timestamp_cb)
kind_info->reset_timestamp_cb(header, ts);This likely doesn't quite conform to what pgindent wants...
In the first place, it's ugly...
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
v1-0001-Do-not-reset-slot-name-of-replication-slot-stats.patchtext/x-patch; charset=us-asciiDownload+7-8
On 9/27/22 1:52 AM, Kyotaro Horiguchi wrote:
Thanks!
At Mon, 26 Sep 2022 19:53:02 -0700, Andres Freund <andres@anarazel.de> wrote in
I wonder if the correct fix here wouldn't be to move the slotname out of
PgStat_StatReplSlotEntry?Ugh. Right. I thought its outer struct as purely the part for the
common header. But we can freely place anything after the header
part. I moved it to the outer struct. I didn't clear that part in
pgstat_create_relation() because it is filled in immediately.The attached is that.
This is still listed as an open item[1]https://wiki.postgresql.org/wiki/PostgreSQL_15_Open_Items for v15. Does this fix proposed
address the issue?
Thanks,
Jonathan
[1]: https://wiki.postgresql.org/wiki/PostgreSQL_15_Open_Items
Hi,
On 2022-10-05 13:00:53 -0400, Jonathan S. Katz wrote:
On 9/27/22 1:52 AM, Kyotaro Horiguchi wrote:
Thanks!
At Mon, 26 Sep 2022 19:53:02 -0700, Andres Freund <andres@anarazel.de> wrote in
I wonder if the correct fix here wouldn't be to move the slotname out of
PgStat_StatReplSlotEntry?Ugh. Right. I thought its outer struct as purely the part for the
common header. But we can freely place anything after the header
part. I moved it to the outer struct. I didn't clear that part in
pgstat_create_relation() because it is filled in immediately.The attached is that.
This is still listed as an open item[1] for v15. Does this fix proposed
address the issue?
Unfortunately not - it doesn't even pass the existing tests
(test_decoding/001_repl_stats fails) :(.
The reason for that is that with the patch nothing restores the slotname when
reading stats from disk. That turns out not to cause immediate issues, but at
the next shutdown the name won't be set, and we'll serialize the stats data
with an empty string as the name.
I have two ideas how to fix it. As a design constraint, I'd be interested in
the RMTs opinion on the following:
Is a cleaner fix that changes the stats format (i.e. existing stats will be
thrown away when upgrading) or one that doesn't change the stats format
preferrable?
Greetings,
Andres Freund
On 10/5/22 8:44 PM, Andres Freund wrote:
Hi,
On 2022-10-05 13:00:53 -0400, Jonathan S. Katz wrote:
On 9/27/22 1:52 AM, Kyotaro Horiguchi wrote:
Thanks!
At Mon, 26 Sep 2022 19:53:02 -0700, Andres Freund <andres@anarazel.de> wrote in
I wonder if the correct fix here wouldn't be to move the slotname out of
PgStat_StatReplSlotEntry?Ugh. Right. I thought its outer struct as purely the part for the
common header. But we can freely place anything after the header
part. I moved it to the outer struct. I didn't clear that part in
pgstat_create_relation() because it is filled in immediately.The attached is that.
This is still listed as an open item[1] for v15. Does this fix proposed
address the issue?Unfortunately not - it doesn't even pass the existing tests
(test_decoding/001_repl_stats fails) :(.
Thanks for checking.
The reason for that is that with the patch nothing restores the slotname when
reading stats from disk. That turns out not to cause immediate issues, but at
the next shutdown the name won't be set, and we'll serialize the stats data
with an empty string as the name.
Ah.
I have two ideas how to fix it. As a design constraint, I'd be interested in
the RMTs opinion on the following:
Is a cleaner fix that changes the stats format (i.e. existing stats will be
thrown away when upgrading) or one that doesn't change the stats format
preferrable?
[My opinion, will let Michael/John chime in]
Ideally we would keep the stats on upgrade -- I think that's the better
user experience.
Thanks,
Jonathan
On Wed, Oct 05, 2022 at 11:24:57PM -0400, Jonathan S. Katz wrote:
On 10/5/22 8:44 PM, Andres Freund wrote:
I have two ideas how to fix it. As a design constraint, I'd be interested in
the RMTs opinion on the following:
Is a cleaner fix that changes the stats format (i.e. existing stats will be
thrown away when upgrading) or one that doesn't change the stats format
preferrable?[My opinion, will let Michael/John chime in]
Ideally we would keep the stats on upgrade -- I think that's the better user
experience.
The release has not happened yet, so I would be fine to remain
flexible and bump again PGSTAT_FILE_FORMAT_ID so as we have the
cleanest approach in place for the release and the future. At the
end, we would throw automatically the data of a file that's marked
with a version that does not match with what we expect at load time,
so there's a limited impact on the user experience, except, well,
losing these stats, of course.
--
Michael
At Thu, 6 Oct 2022 13:44:43 +0900, Michael Paquier <michael@paquier.xyz> wrote in
On Wed, Oct 05, 2022 at 11:24:57PM -0400, Jonathan S. Katz wrote:
On 10/5/22 8:44 PM, Andres Freund wrote:
I have two ideas how to fix it. As a design constraint, I'd be interested in
the RMTs opinion on the following:
Is a cleaner fix that changes the stats format (i.e. existing stats will be
thrown away when upgrading) or one that doesn't change the stats format
preferrable?[My opinion, will let Michael/John chime in]
Ideally we would keep the stats on upgrade -- I think that's the better user
experience.The release has not happened yet, so I would be fine to remain
flexible and bump again PGSTAT_FILE_FORMAT_ID so as we have the
cleanest approach in place for the release and the future. At the
end, we would throw automatically the data of a file that's marked
with a version that does not match with what we expect at load time,
so there's a limited impact on the user experience, except, well,
losing these stats, of course.
+1. FWIW, the atttached is an example of what it looks like if we
avoid file format change.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
v2-0001-If-we-avoid-file-format.patchtext/x-patch; charset=us-asciiDownload+22-10
On 10/6/22 1:10 AM, Kyotaro Horiguchi wrote:
At Thu, 6 Oct 2022 13:44:43 +0900, Michael Paquier <michael@paquier.xyz> wrote in
On Wed, Oct 05, 2022 at 11:24:57PM -0400, Jonathan S. Katz wrote:
On 10/5/22 8:44 PM, Andres Freund wrote:
I have two ideas how to fix it. As a design constraint, I'd be interested in
the RMTs opinion on the following:
Is a cleaner fix that changes the stats format (i.e. existing stats will be
thrown away when upgrading) or one that doesn't change the stats format
preferrable?[My opinion, will let Michael/John chime in]
Ideally we would keep the stats on upgrade -- I think that's the better user
experience.The release has not happened yet, so I would be fine to remain
flexible and bump again PGSTAT_FILE_FORMAT_ID so as we have the
cleanest approach in place for the release and the future.
Yes, I agree with this.
At the
end, we would throw automatically the data of a file that's marked
with a version that does not match with what we expect at load time,
so there's a limited impact on the user experience, except, well,
losing these stats, of course.
I'm fine with this.
+1. FWIW, the atttached is an example of what it looks like if we
avoid file format change.
Thanks for the quick turnaround. I'll let others chime in on the review.
Thanks,
Jonathan