Error "initial slot snapshot too large" in create replication slot

Started by Dilip Kumarover 4 years ago44 messageshackers
Jump to latest
#1Dilip Kumar
dilipbalaut@gmail.com

While creating an "export snapshot" I don't see any protection why the
number of xids in the snapshot can not cross the
"GetMaxSnapshotXidCount()"?.

Basically, while converting the HISTORIC snapshot to the MVCC snapshot
in "SnapBuildInitialSnapshot()", we add all the xids between
snap->xmin to snap->xmax to the MVCC snap->xip array (xids for which
commit were not recorded). The problem is that we add both topxids as
well as the subxids into the same array and expect that the "xid"
count does not cross the "GetMaxSnapshotXidCount()". So it seems like
an issue but I am not sure what is the fix for this, some options are
a) Don't limit the xid count in the exported snapshot and dynamically
resize the array b) Increase the limit to GetMaxSnapshotXidCount() +
GetMaxSnapshotSubxidCount(). But in option b) there would still be a
problem that how do we handle the overflowed subtransaction?

I have locally, reproduced the issue,

1. Configuration
max_connections= 5
autovacuum = off
max_worker_processes = 0

2.Then from pgbench I have run the attached script (test.sql) from 5 clients.
./pgbench -i postgres
./pgbench -c4 -j4 -T 3000 -f test1.sql -P1 postgres

3. Concurrently, create replication slot,
[dilipkumar@localhost bin]$ ./psql "dbname=postgres replication=database"
postgres[7367]=#
postgres[6463]=# CREATE_REPLICATION_SLOT "slot" LOGICAL "test_decoding";
ERROR: 40001: initial slot snapshot too large
LOCATION: SnapBuildInitialSnapshot, snapbuild.c:597
postgres[6463]=# CREATE_REPLICATION_SLOT "slot" LOGICAL "test_decoding";
ERROR: XX000: clearing exported snapshot in wrong transaction state
LOCATION: SnapBuildClearExportedSnapshot, snapbuild.c:690

I could reproduce this issue, at least once in 8-10 attempts of
creating the replication slot.

Note: After that issue, I have noticed one more issue "clearing
exported snapshot in wrong transaction state", that is because the
"ExportInProgress" is not cleared on the transaction abort, for this,
a simple fix is we can clear this state on the transaction abort,
maybe I will raise this as a separate issue?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachments:

test.sqlapplication/sql; name=test.sqlDownload
#2Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Dilip Kumar (#1)
Re: Error "initial slot snapshot too large" in create replication slot

At Mon, 11 Oct 2021 11:49:41 +0530, Dilip Kumar <dilipbalaut@gmail.com> wrote in

While creating an "export snapshot" I don't see any protection why the
number of xids in the snapshot can not cross the
"GetMaxSnapshotXidCount()"?.

Basically, while converting the HISTORIC snapshot to the MVCC snapshot
in "SnapBuildInitialSnapshot()", we add all the xids between
snap->xmin to snap->xmax to the MVCC snap->xip array (xids for which
commit were not recorded). The problem is that we add both topxids as
well as the subxids into the same array and expect that the "xid"
count does not cross the "GetMaxSnapshotXidCount()". So it seems like
an issue but I am not sure what is the fix for this, some options are

It seems to me that it is a compromise between the restriction of the
legitimate snapshot and snapshots created by snapbuild. If the xids
overflow, the resulting snapshot may lose a siginificant xid, i.e, a
top-level xid.

a) Don't limit the xid count in the exported snapshot and dynamically
resize the array b) Increase the limit to GetMaxSnapshotXidCount() +
GetMaxSnapshotSubxidCount(). But in option b) there would still be a
problem that how do we handle the overflowed subtransaction?

I'm afraid that we shouldn't expand the size limits. If I understand
it correctly, we only need the top-level xids in the exported snapshot
and reorder buffer knows whether a xid is a top-level or not after
establishing a consistent snapshot.

The attached diff tries to make SnapBuildInitialSnapshot exclude
subtransaction from generated snapshots. It seems working fine for
you repro. (Of course, I'm not confident that it is the correct thing,
though..)

What do you think about this?

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

exclude_subxact_from_snapbuild_snapshot.diff.txttext/plain; charset=us-asciiDownload+32-5
#3Dilip Kumar
dilipbalaut@gmail.com
In reply to: Kyotaro Horiguchi (#2)
Re: Error "initial slot snapshot too large" in create replication slot

On Mon, Oct 11, 2021 at 4:29 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Mon, 11 Oct 2021 11:49:41 +0530, Dilip Kumar <dilipbalaut@gmail.com> wrote in

While creating an "export snapshot" I don't see any protection why the
number of xids in the snapshot can not cross the
"GetMaxSnapshotXidCount()"?.

Basically, while converting the HISTORIC snapshot to the MVCC snapshot
in "SnapBuildInitialSnapshot()", we add all the xids between
snap->xmin to snap->xmax to the MVCC snap->xip array (xids for which
commit were not recorded). The problem is that we add both topxids as
well as the subxids into the same array and expect that the "xid"
count does not cross the "GetMaxSnapshotXidCount()". So it seems like
an issue but I am not sure what is the fix for this, some options are

It seems to me that it is a compromise between the restriction of the
legitimate snapshot and snapshots created by snapbuild. If the xids
overflow, the resulting snapshot may lose a siginificant xid, i.e, a
top-level xid.

a) Don't limit the xid count in the exported snapshot and dynamically
resize the array b) Increase the limit to GetMaxSnapshotXidCount() +
GetMaxSnapshotSubxidCount(). But in option b) there would still be a
problem that how do we handle the overflowed subtransaction?

I'm afraid that we shouldn't expand the size limits. If I understand
it correctly, we only need the top-level xids in the exported snapshot

But since we are using this as an MVCC snapshot, if we don't have the
subxid and if we also don't mark the "suboverflowed" flag then IMHO
the sub-transaction visibility might be wrong, Am I missing something?

and reorder buffer knows whether a xid is a top-level or not after
establishing a consistent snapshot.

The attached diff tries to make SnapBuildInitialSnapshot exclude
subtransaction from generated snapshots. It seems working fine for
you repro. (Of course, I'm not confident that it is the correct thing,
though..)

What do you think about this?

If your statement that we only need top-xids in the exported snapshot,
is true then this fix looks fine to me. If not then we might need to
add the sub-xids in the snapshot->subxip array and if it crosses the
limit of GetMaxSnapshotSubxidCount(), then we can mark "suboverflowed"
flag.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#4Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Dilip Kumar (#3)
Re: Error "initial slot snapshot too large" in create replication slot

At Mon, 11 Oct 2021 16:48:10 +0530, Dilip Kumar <dilipbalaut@gmail.com> wrote in

On Mon, Oct 11, 2021 at 4:29 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Mon, 11 Oct 2021 11:49:41 +0530, Dilip Kumar <dilipbalaut@gmail.com> wrote in

While creating an "export snapshot" I don't see any protection why the
number of xids in the snapshot can not cross the
"GetMaxSnapshotXidCount()"?.

Basically, while converting the HISTORIC snapshot to the MVCC snapshot
in "SnapBuildInitialSnapshot()", we add all the xids between
snap->xmin to snap->xmax to the MVCC snap->xip array (xids for which
commit were not recorded). The problem is that we add both topxids as
well as the subxids into the same array and expect that the "xid"
count does not cross the "GetMaxSnapshotXidCount()". So it seems like
an issue but I am not sure what is the fix for this, some options are

It seems to me that it is a compromise between the restriction of the
legitimate snapshot and snapshots created by snapbuild. If the xids
overflow, the resulting snapshot may lose a siginificant xid, i.e, a
top-level xid.

a) Don't limit the xid count in the exported snapshot and dynamically
resize the array b) Increase the limit to GetMaxSnapshotXidCount() +
GetMaxSnapshotSubxidCount(). But in option b) there would still be a
problem that how do we handle the overflowed subtransaction?

I'm afraid that we shouldn't expand the size limits. If I understand
it correctly, we only need the top-level xids in the exported snapshot

But since we are using this as an MVCC snapshot, if we don't have the
subxid and if we also don't mark the "suboverflowed" flag then IMHO
the sub-transaction visibility might be wrong, Am I missing something?

Sorry I should have set suboverflowed in the generated snapshot.
However, we can store subxid list as well when the snapshot (or
running_xact) is not overflown. These (should) works the same way.

On physical standby, snapshot is created just filling up only subxip
with all top and sub xids (procrray.c:2400). It would be better we do
the same thing here.

and reorder buffer knows whether a xid is a top-level or not after
establishing a consistent snapshot.

The attached diff tries to make SnapBuildInitialSnapshot exclude
subtransaction from generated snapshots. It seems working fine for
you repro. (Of course, I'm not confident that it is the correct thing,
though..)

What do you think about this?

If your statement that we only need top-xids in the exported snapshot,
is true then this fix looks fine to me. If not then we might need to
add the sub-xids in the snapshot->subxip array and if it crosses the
limit of GetMaxSnapshotSubxidCount(), then we can mark "suboverflowed"
flag.

So I came up with the attached version.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

0001-Allow-overflowed-snapshot-when-creating-logical-repl.patchtext/x-patch; charset=us-asciiDownload+59-13
#5Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#4)
Re: Error "initial slot snapshot too large" in create replication slot

At Tue, 12 Oct 2021 13:59:59 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

So I came up with the attached version.

Sorry, it was losing a piece of change.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

v2-0001-Allow-overflowed-snapshot-when-creating-logical-r.patchtext/x-patch; charset=us-asciiDownload+65-13
#6Dilip Kumar
dilipbalaut@gmail.com
In reply to: Kyotaro Horiguchi (#5)
Re: Error "initial slot snapshot too large" in create replication slot

On Tue, Oct 12, 2021 at 10:35 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Tue, 12 Oct 2021 13:59:59 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

So I came up with the attached version.

Sorry, it was losing a piece of change.

Yeah, at a high level this is on the idea I have in mind, I will do a
detailed review in a day or two. Thanks for working on this.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#7Dilip Kumar
dilipbalaut@gmail.com
In reply to: Dilip Kumar (#6)
Re: Error "initial slot snapshot too large" in create replication slot

On Tue, Oct 12, 2021 at 11:30 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Oct 12, 2021 at 10:35 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Tue, 12 Oct 2021 13:59:59 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

So I came up with the attached version.

Sorry, it was losing a piece of change.

Yeah, at a high level this is on the idea I have in mind, I will do a
detailed review in a day or two. Thanks for working on this.

While doing the detailed review, I think there are a couple of
problems with the patch, the main problem of storing all the xid in
the snap->subxip is that once we mark the snapshot overflown then the
XidInMVCCSnapshot, will not search the subxip array, instead it will
fetch the topXid and search in the snap->xip array. Another issue is
that the total xids could be GetMaxSnapshotSubxidCount()
+GetMaxSnapshotXidCount().

I think what we should be doing is that if the xid is know subxid then
add in the snap->subxip array otherwise in snap->xip array. So
snap->xip array size will be GetMaxSnapshotXidCount() whereas the
snap->subxip array size will be GetMaxSnapshotSubxidCount(). And if
the array size is full then we can stop adding the subxids to the
array.

What is your thought on this?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#8Dilip Kumar
dilipbalaut@gmail.com
In reply to: Dilip Kumar (#7)
Re: Error "initial slot snapshot too large" in create replication slot

On Tue, Oct 19, 2021 at 2:25 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Oct 12, 2021 at 11:30 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Oct 12, 2021 at 10:35 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Tue, 12 Oct 2021 13:59:59 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

So I came up with the attached version.

Sorry, it was losing a piece of change.

Yeah, at a high level this is on the idea I have in mind, I will do a
detailed review in a day or two. Thanks for working on this.

While doing the detailed review, I think there are a couple of
problems with the patch, the main problem of storing all the xid in
the snap->subxip is that once we mark the snapshot overflown then the
XidInMVCCSnapshot, will not search the subxip array, instead it will
fetch the topXid and search in the snap->xip array.

I missed that you are marking the snapshot as takenDuringRecovery so
your fix looks fine.

Another issue is

that the total xids could be GetMaxSnapshotSubxidCount()
+GetMaxSnapshotXidCount().

I have fixed this, the updated patch is attached.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v3-0001-Allow-overflowed-snapshot-when-creating-logical-r.patchtext/x-patch; charset=US-ASCII; name=v3-0001-Allow-overflowed-snapshot-when-creating-logical-r.patchDownload+66-13
#9Julien Rouhaud
rjuju123@gmail.com
In reply to: Dilip Kumar (#8)
Re: Error "initial slot snapshot too large" in create replication slot

Hi,

On Tue, Nov 02, 2021 at 04:40:39PM +0530, Dilip Kumar wrote:

I have fixed this, the updated patch is attached.

The cfbot reports that this patch doesn't compile:
https://cirrus-ci.com/task/5642000073490432?logs=build

[03:01:24.477] snapbuild.c: In function ‘SnapBuildInitialSnapshot’:
[03:01:24.477] snapbuild.c:587:2: error: ‘newsubxcnt’ undeclared (first use in this function); did you mean ‘newsubxip’?
[03:01:24.477] 587 | newsubxcnt = 0;
[03:01:24.477] | ^~~~~~~~~~
[03:01:24.477] | newsubxip
[03:01:24.477] snapbuild.c:587:2: note: each undeclared identifier is reported only once for each function it appears in
[03:01:24.477] snapbuild.c:535:8: warning: unused variable ‘maxxidcnt’ [-Wunused-variable]
[03:01:24.477] 535 | int maxxidcnt;
[03:01:24.477] | ^~~~~~~~~

Could you send a new version? In the meantime I will switch the patch to
Waiting on Author.

#10Dilip Kumar
dilipbalaut@gmail.com
In reply to: Julien Rouhaud (#9)
Re: Error "initial slot snapshot too large" in create replication slot

On Wed, Jan 12, 2022 at 4:09 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

Hi,

On Tue, Nov 02, 2021 at 04:40:39PM +0530, Dilip Kumar wrote:

I have fixed this, the updated patch is attached.

The cfbot reports that this patch doesn't compile:
https://cirrus-ci.com/task/5642000073490432?logs=build

[03:01:24.477] snapbuild.c: In function ‘SnapBuildInitialSnapshot’:
[03:01:24.477] snapbuild.c:587:2: error: ‘newsubxcnt’ undeclared (first
use in this function); did you mean ‘newsubxip’?
[03:01:24.477] 587 | newsubxcnt = 0;
[03:01:24.477] | ^~~~~~~~~~
[03:01:24.477] | newsubxip
[03:01:24.477] snapbuild.c:587:2: note: each undeclared identifier is
reported only once for each function it appears in
[03:01:24.477] snapbuild.c:535:8: warning: unused variable ‘maxxidcnt’
[-Wunused-variable]
[03:01:24.477] 535 | int maxxidcnt;
[03:01:24.477] | ^~~~~~~~~

Could you send a new version? In the meantime I will switch the patch to
Waiting on Author.

Thanks for notifying, I will work on this and send the update patch soon.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#11Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Dilip Kumar (#8)
Re: Error "initial slot snapshot too large" in create replication slot

At Tue, 2 Nov 2021 16:40:39 +0530, Dilip Kumar <dilipbalaut@gmail.com> wrote in

On Tue, Oct 19, 2021 at 2:25 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Oct 12, 2021 at 11:30 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Oct 12, 2021 at 10:35 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Tue, 12 Oct 2021 13:59:59 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

So I came up with the attached version.

Sorry, it was losing a piece of change.

Yeah, at a high level this is on the idea I have in mind, I will do a
detailed review in a day or two. Thanks for working on this.

While doing the detailed review, I think there are a couple of
problems with the patch, the main problem of storing all the xid in
the snap->subxip is that once we mark the snapshot overflown then the
XidInMVCCSnapshot, will not search the subxip array, instead it will
fetch the topXid and search in the snap->xip array.

I missed that you are marking the snapshot as takenDuringRecovery so
your fix looks fine.

Another issue is

that the total xids could be GetMaxSnapshotSubxidCount()
+GetMaxSnapshotXidCount().

I have fixed this, the updated patch is attached.

Mmm. The size of the array cannot be larger than the numbers the
*Connt() functions return. Thus we cannot attach the oversized array
to ->subxip. (I don't recall clearly but that would lead to assertion
failure somewhere..)

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#12Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Dilip Kumar (#8)
Re: Error "initial slot snapshot too large" in create replication slot

At Mon, 17 Jan 2022 09:27:14 +0530, Dilip Kumar <dilipbalaut@gmail.com> wrote in

On Wed, Jan 12, 2022 at 4:09 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

The cfbot reports that this patch doesn't compile:
https://cirrus-ci.com/task/5642000073490432?logs=build

[03:01:24.477] snapbuild.c: In function ‘SnapBuildInitialSnapshot’:
[03:01:24.477] snapbuild.c:587:2: error: ‘newsubxcnt’ undeclared (first
use in this function); did you mean ‘newsubxip’?
[03:01:24.477] 587 | newsubxcnt = 0;
[03:01:24.477] | ^~~~~~~~~~
[03:01:24.477] | newsubxip
[03:01:24.477] snapbuild.c:587:2: note: each undeclared identifier is
reported only once for each function it appears in
[03:01:24.477] snapbuild.c:535:8: warning: unused variable ‘maxxidcnt’
[-Wunused-variable]
[03:01:24.477] 535 | int maxxidcnt;
[03:01:24.477] | ^~~~~~~~~

Could you send a new version? In the meantime I will switch the patch to
Waiting on Author.

Thanks for notifying, I will work on this and send the update patch soon.

me> Mmm. The size of the array cannot be larger than the numbers the
me> *Connt() functions return. Thus we cannot attach the oversized array
me> to ->subxip. (I don't recall clearly but that would lead to assertion
me> failure somewhere..)

Then, I fixed the v3 error and post v4.

To recap:

SnapBUildInitialSnapshot tries to store XIDS of both top and sub
transactions into snapshot->xip array but the array is easily
overflowed and CREATE_REPLICATOIN_SLOT command ends with an error.

To fix this, this patch is doing the following things.

- Use subxip array instead of xip array to allow us have larger array
for xids. So the snapshot is marked as takenDuringRecovery, which
is a kind of abuse but largely reduces the chance of getting
"initial slot snapshot too large" error.

- Still if subxip is overflowed, retry with excluding subtransactions
then set suboverflowed. This causes XidInMVCCSnapshot (finally)
scans over subxip array for targetted top-level xid.

We could take another way: make a !takenDuringRecovery snapshot by
using xip instead of subxip. It is cleaner but it has far larger
chance of needing to retry.

(renamed the patch since it represented a part of the patch)

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

v4-0001-Avoid-an-error-while-creating-logical-replication.patchtext/x-patch; charset=us-asciiDownload+63-13
#13Dilip Kumar
dilipbalaut@gmail.com
In reply to: Kyotaro Horiguchi (#12)
Re: Error "initial slot snapshot too large" in create replication slot

On Mon, Jan 31, 2022 at 11:50 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

SnapBUildInitialSnapshot tries to store XIDS of both top and sub
transactions into snapshot->xip array but the array is easily
overflowed and CREATE_REPLICATOIN_SLOT command ends with an error.

To fix this, this patch is doing the following things.

- Use subxip array instead of xip array to allow us have larger array
for xids. So the snapshot is marked as takenDuringRecovery, which
is a kind of abuse but largely reduces the chance of getting
"initial slot snapshot too large" error.

- Still if subxip is overflowed, retry with excluding subtransactions
then set suboverflowed. This causes XidInMVCCSnapshot (finally)
scans over subxip array for targetted top-level xid.

We could take another way: make a !takenDuringRecovery snapshot by
using xip instead of subxip. It is cleaner but it has far larger
chance of needing to retry.

(renamed the patch since it represented a part of the patch)

Thanks for the updated version. I will look into it this week.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#14Dilip Kumar
dilipbalaut@gmail.com
In reply to: Kyotaro Horiguchi (#12)
Re: Error "initial slot snapshot too large" in create replication slot

On Mon, Jan 31, 2022 at 11:50 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Mon, 17 Jan 2022 09:27:14 +0530, Dilip Kumar <dilipbalaut@gmail.com> wrote in

me> Mmm. The size of the array cannot be larger than the numbers the
me> *Connt() functions return. Thus we cannot attach the oversized array
me> to ->subxip. (I don't recall clearly but that would lead to assertion
me> failure somewhere..)

Then, I fixed the v3 error and post v4.

Yeah you are right, SetTransactionSnapshot() has that assertion.
Anyway after looking again it appears that
GetMaxSnapshotSubxidCount is the correct size because this is
PGPROC_MAX_CACHED_SUBXIDS +1, i.e. it considers top transactions as
well so we don't need to add them separately.

SnapBUildInitialSnapshot tries to store XIDS of both top and sub
transactions into snapshot->xip array but the array is easily
overflowed and CREATE_REPLICATOIN_SLOT command ends with an error.

To fix this, this patch is doing the following things.

- Use subxip array instead of xip array to allow us have larger array
for xids. So the snapshot is marked as takenDuringRecovery, which
is a kind of abuse but largely reduces the chance of getting
"initial slot snapshot too large" error.

Right. I think the patch looks fine to me.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#15Yura Sokolov
y.sokolov@postgrespro.ru
In reply to: Dilip Kumar (#14)
Re: Error "initial slot snapshot too large" in create replication slot

В Пн, 07/02/2022 в 13:52 +0530, Dilip Kumar пишет:

On Mon, Jan 31, 2022 at 11:50 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Mon, 17 Jan 2022 09:27:14 +0530, Dilip Kumar <dilipbalaut@gmail.com> wrote in

me> Mmm. The size of the array cannot be larger than the numbers the
me> *Connt() functions return. Thus we cannot attach the oversized array
me> to ->subxip. (I don't recall clearly but that would lead to assertion
me> failure somewhere..)

Then, I fixed the v3 error and post v4.

Yeah you are right, SetTransactionSnapshot() has that assertion.
Anyway after looking again it appears that
GetMaxSnapshotSubxidCount is the correct size because this is
PGPROC_MAX_CACHED_SUBXIDS +1, i.e. it considers top transactions as
well so we don't need to add them separately.

SnapBUildInitialSnapshot tries to store XIDS of both top and sub
transactions into snapshot->xip array but the array is easily
overflowed and CREATE_REPLICATOIN_SLOT command ends with an error.

To fix this, this patch is doing the following things.

- Use subxip array instead of xip array to allow us have larger array
for xids. So the snapshot is marked as takenDuringRecovery, which
is a kind of abuse but largely reduces the chance of getting
"initial slot snapshot too large" error.

Right. I think the patch looks fine to me.

Good day.

I've looked to the patch. Personally I'd prefer dynamically resize
xip array. But I think there is issue with upgrade if replica source
is upgraded before destination, right?

Concerning patch, I think more comments should be written about new
usage case for `takenDuringRecovery`. May be this field should be renamed
at all?

And there are checks for `takenDuringRecovery` in `heapgetpage` and
`heapam_scan_sample_next_tuple`. Are this checks affected by the change?
Neither the preceding discussion nor commit message answer me.

-------

regards

Yura Sokolov
Postgres Professional
y.sokolov@postgrespro.ru
funny.falcon@gmail.com

#16Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Yura Sokolov (#15)
Re: Error "initial slot snapshot too large" in create replication slot

At Sun, 13 Feb 2022 17:35:38 +0300, Yura Sokolov <y.sokolov@postgrespro.ru> wrote in

В Пн, 07/02/2022 в 13:52 +0530, Dilip Kumar пишет:

Right. I think the patch looks fine to me.

Good day.

I've looked to the patch. Personally I'd prefer dynamically resize
xip array. But I think there is issue with upgrade if replica source
is upgraded before destination, right?

I don't think it is relevant. I think we don't convey snapshot via
streaming replication. But I think that expanding xip or subxip is
wrong, since it is tied with ProcArray structure. (Even though we
abuse the arrays in some situations, like this).

Concerning patch, I think more comments should be written about new
usage case for `takenDuringRecovery`. May be this field should be renamed
at all?

I don't feel the need to rename it so much. It just signals that "this
snapshot is in the form for recovery". The more significant reason is
that I don't come up better name:p

And the comment is slightly modified and gets a pointer to detailed
comment.

+	 * Although this snapshot is not acutally taken during recovery, all XIDs
+	 * are stored in subxip. See GetSnapshotData for the details of that form
+	 * of snapshot.

And there are checks for `takenDuringRecovery` in `heapgetpage` and
`heapam_scan_sample_next_tuple`. Are this checks affected by the change?
Neither the preceding discussion nor commit message answer me.

The snapshot works correctly, but for the heapgetpage case, it foces
all_visible to be false. That unnecessarily prevents visibility check
from skipping.

An annoying thing in SnapBuildInitialSnapshot is that we don't know
the number of xids before looping over the xid range, and we don't
want to bother sorting top-level xids and subxids unless we have to do
so.

Is it better that we hassle in SnapBuildInitialSnapshot to create a
!takenDuringRecovery snapshot?

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#17Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#16)
Re: Error "initial slot snapshot too large" in create replication slot

At Fri, 01 Apr 2022 14:44:30 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

At Sun, 13 Feb 2022 17:35:38 +0300, Yura Sokolov <y.sokolov@postgrespro.ru> wrote in

And there are checks for `takenDuringRecovery` in `heapgetpage` and
`heapam_scan_sample_next_tuple`. Are this checks affected by the change?
Neither the preceding discussion nor commit message answer me.

The snapshot works correctly, but for the heapgetpage case, it foces
all_visible to be false. That unnecessarily prevents visibility check
from skipping.

An annoying thing in SnapBuildInitialSnapshot is that we don't know
the number of xids before looping over the xid range, and we don't
want to bother sorting top-level xids and subxids unless we have to do
so.

Is it better that we hassle in SnapBuildInitialSnapshot to create a
!takenDuringRecovery snapshot?

So this is that. v5 creates a regular snapshot.

By the way, is there any chance this could be committed to 15?
Otherwise I'll immediately move this to the next CF.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

v5-0001-Create-correct-snapshot-during-CREATE_REPLICATION.patchtext/x-patch; charset=us-asciiDownload+58-9
#18Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Kyotaro Horiguchi (#17)
Re: Error "initial slot snapshot too large" in create replication slot

On Thu, Mar 31, 2022 at 11:53 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

So this is that. v5 creates a regular snapshot.

This patch will need a quick rebase over 905c020bef9, which added
`extern` to several missing locations.

Thanks,
--Jacob

#19Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Jacob Champion (#18)
Re: Error "initial slot snapshot too large" in create replication slot

At Tue, 5 Jul 2022 11:32:42 -0700, Jacob Champion <jchampion@timescale.com> wrote in

On Thu, Mar 31, 2022 at 11:53 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

So this is that. v5 creates a regular snapshot.

This patch will need a quick rebase over 905c020bef9, which added
`extern` to several missing locations.

Thanks! Just rebased.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

v6-0001-Create-correct-snapshot-during-CREATE_REPLICATION.patchtext/x-patch; charset=us-asciiDownload+58-9
#20Robert Haas
robertmhaas@gmail.com
In reply to: Kyotaro Horiguchi (#19)
Re: Error "initial slot snapshot too large" in create replication slot

On Mon, Jul 18, 2022 at 10:55 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

Thanks! Just rebased.

Hi,

I mentioned this patch to Andres in conversation, and he expressed a
concern that there might be no guarantee that we retain enough CLOG to
look up XIDs. Presumably this wouldn't be an issue when the snapshot
doesn't get marked suboverflowed, but what if it does?

Adding Andres in the hopes that he may comment further.

--
Robert Haas
EDB: http://www.enterprisedb.com

#21Andres Freund
andres@anarazel.de
In reply to: Kyotaro Horiguchi (#19)
#22Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#20)
#23Dilip Kumar
dilipbalaut@gmail.com
In reply to: Andres Freund (#21)
#24Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Dilip Kumar (#23)
#25Dilip Kumar
dilipbalaut@gmail.com
In reply to: Kyotaro Horiguchi (#24)
#26Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Andres Freund (#21)
#27Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Dilip Kumar (#25)
#28Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#26)
#29Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#27)
#30Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#29)
#31Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#28)
#32Michael Paquier
michael@paquier.xyz
In reply to: Kyotaro Horiguchi (#31)
#33Andres Freund
andres@anarazel.de
In reply to: Michael Paquier (#32)
#34Gregory Stark (as CFM)
stark.cfm@gmail.com
In reply to: Michael Paquier (#32)
#35Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Gregory Stark (as CFM) (#34)
#36Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#35)
#37Dilip Kumar
dilipbalaut@gmail.com
In reply to: Kyotaro Horiguchi (#36)
#38Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Dilip Kumar (#37)
#39Robert Haas
robertmhaas@gmail.com
In reply to: Kyotaro Horiguchi (#38)
#40vignesh C
vignesh21@gmail.com
In reply to: Robert Haas (#39)
#41Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Robert Haas (#39)
#42Robert Haas
robertmhaas@gmail.com
In reply to: Kyotaro Horiguchi (#41)
#43Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Robert Haas (#42)
#44Robert Haas
robertmhaas@gmail.com
In reply to: Kyotaro Horiguchi (#43)