BUG #17903: There is a bug in the KeepLogSeg()
The following bug has been logged on the website:
Bug reference: 17903
Logged by: xu xingwang
Email address: xu.xw2008@163.com
PostgreSQL version: 13.3
Operating system: openEuler
Description:
Hi,
I found that KeepLogSeg() has a piece of code that is not correctly.
segno may be larger than currSegNo, since the slot_keep_segs variable is of
type "uint64", in this case the code "if (currSegNo - segno >
slot_keep_segs)" is incorrect.
"if (currSegNo - segno < keep_segs)" is also the same.
Checkpoint calls the KeepLogSeg function, and there are many operations
between recptr and XLogGetReplicationSlotMinimumLSN, including updating the
pg_control file, so segno may be larger than currSegNo.
KeepLogSeg(XLogRecPtr recptr, XLogSegNo *logSegNo)
{
XLogSegNo currSegNo;
XLogSegNo segno;
XLogRecPtr keep;
XLByteToSeg(recptr, currSegNo, wal_segment_size);
segno = currSegNo;
/*
* Calculate how many segments are kept by slots first, adjusting for
* max_slot_wal_keep_size.
*/
keep = XLogGetReplicationSlotMinimumLSN();
if (keep != InvalidXLogRecPtr)
{
XLByteToSeg(keep, segno, wal_segment_size);
/* Cap by max_slot_wal_keep_size ... */
if (max_slot_wal_keep_size_mb >= 0)
{
uint64 slot_keep_segs;
slot_keep_segs =
ConvertToXSegs(max_slot_wal_keep_size_mb, wal_segment_size);
if (currSegNo - segno > slot_keep_segs)
segno = currSegNo - slot_keep_segs;
}
}
/* but, keep at least wal_keep_size if that's set */
if (wal_keep_size_mb > 0)
{
uint64 keep_segs;
keep_segs = ConvertToXSegs(wal_keep_size_mb, wal_segment_size);
if (currSegNo - segno < keep_segs)
{
/* avoid underflow, don't go below 1 */
if (currSegNo <= keep_segs)
segno = 1;
else
segno = currSegNo - keep_segs;
}
}
regards.
--
xu xingwang
At Wed, 19 Apr 2023 10:26:13 +0000, PG Bug reporting form <noreply@postgresql.org> wrote in
I found that KeepLogSeg() has a piece of code that is not correctly.
segno may be larger than currSegNo, since the slot_keep_segs variable is of
type "uint64", in this case the code "if (currSegNo - segno >
slot_keep_segs)" is incorrect."if (currSegNo - segno < keep_segs)" is also the same.
Checkpoint calls the KeepLogSeg function, and there are many operations
between recptr and XLogGetReplicationSlotMinimumLSN, including updating the
pg_control file, so segno may be larger than currSegNo.
Correct. Thanks for the report.
If checkpointer somehow takes a long time between inserting a
checkpoint record and removing WAL files, while replication advances a
certain distnace, it can actually happen. Although that behavior
doesn't directly affect max_slot_wal_keep_size, it does disrupt the
effect of wal_keep_size.
The thinko was that we incorrectly assumed the slot minimum LSN can't
be larger than the checkpoint record LSN. We don't need to consider
max_slot_wal_keep_size if the slot minimum LSN is already larger than
currSegNo.
The attached fix works. However, I can't come up with a reasonable
testing script.
This dates back to 13, where max_slot_wal_keep_size was introduced.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
0001-Fix-incorrect-calculation-regarding-max_slot_wal_kee.patchtext/x-patch; charset=us-asciiDownload+5-3
If `segno` can be larger than `currSegNo`, your patch seems to miss
the following branch,
are you missing this for some particular reason?
```
/* but, keep at least wal_keep_size if that's set */
if (wal_keep_size_mb > 0)
{
uint64 keep_segs;
keep_segs = ConvertToXSegs(wal_keep_size_mb, wal_segment_size);
if (currSegNo - segno < keep_segs) <<<< see here
{
/* avoid underflow, don't go below 1 */
if (currSegNo <= keep_segs)
segno = 1;
else
segno = currSegNo - keep_segs;
}
}
```
On Thu, Apr 20, 2023 at 11:04 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Wed, 19 Apr 2023 10:26:13 +0000, PG Bug reporting form <noreply@postgresql.org> wrote in
I found that KeepLogSeg() has a piece of code that is not correctly.
segno may be larger than currSegNo, since the slot_keep_segs variable is of
type "uint64", in this case the code "if (currSegNo - segno >
slot_keep_segs)" is incorrect."if (currSegNo - segno < keep_segs)" is also the same.
Checkpoint calls the KeepLogSeg function, and there are many operations
between recptr and XLogGetReplicationSlotMinimumLSN, including updating the
pg_control file, so segno may be larger than currSegNo.Correct. Thanks for the report.
If checkpointer somehow takes a long time between inserting a
checkpoint record and removing WAL files, while replication advances a
certain distnace, it can actually happen. Although that behavior
doesn't directly affect max_slot_wal_keep_size, it does disrupt the
effect of wal_keep_size.The thinko was that we incorrectly assumed the slot minimum LSN can't
be larger than the checkpoint record LSN. We don't need to consider
max_slot_wal_keep_size if the slot minimum LSN is already larger than
currSegNo.The attached fix works. However, I can't come up with a reasonable
testing script.This dates back to 13, where max_slot_wal_keep_size was introduced.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
--
Regards
Junwang Zhao
At Thu, 20 Apr 2023 15:40:14 +0800, Junwang Zhao <zhjwpku@gmail.com> wrote in
If `segno` can be larger than `currSegNo`, your patch seems to miss
the following branch,
are you missing this for some particular reason?
Oops! Sorry for the mistake and thanks for pointing it out.
I should have kept segno within the reasonable range.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
v2-0001-Fix-incorrect-calculation-regarding-max_slot_wal_.patchtext/x-patch; charset=us-asciiDownload+8-1
This patch looks reasonable.
+1
On Thu, Apr 20, 2023 at 4:58 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Thu, 20 Apr 2023 15:40:14 +0800, Junwang Zhao <zhjwpku@gmail.com> wrote in
If `segno` can be larger than `currSegNo`, your patch seems to miss
the following branch,
are you missing this for some particular reason?Oops! Sorry for the mistake and thanks for pointing it out.
I should have kept segno within the reasonable range.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
--
Regards
Junwang Zhao
On Thu, Apr 20, 2023 at 05:58:14PM +0900, Kyotaro Horiguchi wrote:
+ /* + * Slot minimum LSN could be greater than recptr. In such cases, no + * segments are protected by slots but we still need to keep segno in a + * reasonable range for subsequent calculations to avoid underflow. + */ + if (segno > currSegNo) + segno = currSegNo; +
I wonder if it'd be better to instead change
if (currSegNo - segno > slot_keep_segs)
to
if (currSegNo > segno + slot_keep_segs)
and
if (currSegNo - segno < keep_segs)
to
if (currSegNo < segNo + keep_segs)
If segno > currSegNo, the first conditional would be false as expected, and
the second would be true as expected. We could also use
pg_sub_u64_overflow() to detect underflow, but that might be excessive in
this case.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
At Thu, 20 Apr 2023 15:02:42 -0700, Nathan Bossart <nathandbossart@gmail.com> wrote in
I wonder if it'd be better to instead change
if (currSegNo - segno > slot_keep_segs)
to
if (currSegNo > segno + slot_keep_segs)and
if (currSegNo - segno < keep_segs)
to
if (currSegNo < segNo + keep_segs)If segno > currSegNo, the first conditional would be false as expected, and
the second would be true as expected. We could also use
pg_sub_u64_overflow() to detect underflow, but that might be excessive in
this case.
From what I understand, the XLogSegNo calculations are designed
without considering the actual value range. Basically, it assumes that
(XLogSegNo + <any positive int>) can overflow. If we take the actual
value range into account, we can make that change.
The choice lies on whether we assume the actual value range or not.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Fri, Apr 21, 2023 at 10:42:31AM +0900, Kyotaro Horiguchi wrote:
From what I understand, the XLogSegNo calculations are designed
without considering the actual value range. Basically, it assumes that
(XLogSegNo + <any positive int>) can overflow. If we take the actual
value range into account, we can make that change.The choice lies on whether we assume the actual value range or not.
Yeah, after staring at this some more, I think your proposed fix is the way
to go. Alternatively, we could adjust the conditional for the
max_slot_wal_keep_size block to
if (keep != InvalidXLogRecPtr && keep < recptr)
but AFAICT that yields the exact same behavior.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Here is a new version of the patch. It is fundamentally the same as v2,
but I've adjusted the comment and commit message a bit. Barring
objections, I am planning to commit this (and back-patch to v13) in the
near future.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Attachments:
v3-0001-Prevent-underflow-in-KeepLogSeg.patchtext/x-diff; charset=us-asciiDownload+9-1
At Mon, 24 Apr 2023 12:14:52 -0700, Nathan Bossart <nathandbossart@gmail.com> wrote in
Here is a new version of the patch. It is fundamentally the same as v2,
but I've adjusted the comment and commit message a bit. Barring
objections, I am planning to commit this (and back-patch to v13) in the
near future.
When determining the oldest segment that must be kept for
replication slots, KeepLogSeg() might calculate a segment number
ahead of the one for the LSN given to the function. This causes
Maybe the KeepLogSeg() is a mistake of
XLogGetReplicationSlotMinimumLSN()?
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Tue, Apr 25, 2023 at 01:14:52PM +0900, Kyotaro Horiguchi wrote:
At Mon, 24 Apr 2023 12:14:52 -0700, Nathan Bossart <nathandbossart@gmail.com> wrote in
When determining the oldest segment that must be kept for
replication slots, KeepLogSeg() might calculate a segment number
ahead of the one for the LSN given to the function. This causesMaybe the KeepLogSeg() is a mistake of
XLogGetReplicationSlotMinimumLSN()?
I adjusted the commit message to call out that function explicitly. Also,
I decided to go with the "keep < recptr" approach since there's no reason
to do anything in that block otherwise.
Thanks!
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Thu, Apr 27, 2023 at 02:47:46PM -0700, Nathan Bossart wrote:
I adjusted the commit message to call out that function explicitly. Also,
I decided to go with the "keep < recptr" approach since there's no reason
to do anything in that block otherwise.
b726236 exists in the tree, but it seems like the message of
pgsql-committers is held on moderation?
--
Michael
At Thu, 27 Apr 2023 14:47:46 -0700, Nathan Bossart <nathandbossart@gmail.com> wrote in
On Tue, Apr 25, 2023 at 01:14:52PM +0900, Kyotaro Horiguchi wrote:
At Mon, 24 Apr 2023 12:14:52 -0700, Nathan Bossart <nathandbossart@gmail.com> wrote in
When determining the oldest segment that must be kept for
replication slots, KeepLogSeg() might calculate a segment number
ahead of the one for the LSN given to the function. This causesMaybe the KeepLogSeg() is a mistake of
XLogGetReplicationSlotMinimumLSN()?I adjusted the commit message to call out that function explicitly. Also,
I decided to go with the "keep < recptr" approach since there's no reason
to do anything in that block otherwise.
That works, too. Thanks!
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center