BUG #18385: Assert("strategy_delta >= 0") in BgBufferSync() fails due to race condition

Started by PG Bug reporting formabout 2 years ago5 messagesbugs
Jump to latest
#1PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 18385
Logged by: Alexander Lakhin
Email address: exclusion@gmail.com
PostgreSQL version: 16.2
Operating system: Ubuntu 22.04
Description:

With a small shared_buffers value and a short bgwriter_delay:
shared_buffers = '1MB'
bgwriter_delay = 10

processing concurrent writing workload like:
pgbench -i
pgbench -t 10000 -c 40

on a slow machine leads to:
number of transactions actually processed: 103642/400000
...
tps = 187.796284 (without initial connection time)
pgbench: error: Run was aborted; the above results are incomplete.
...
TRAP: failed Assert("strategy_delta >= 0"), File: "bufmgr.c", Line: 2836,
PID: 20941
postgres: background writer (ExceptionalCondition+0x52)[0x5581a8dd1677]
postgres: background writer (BgBufferSync+0xb6)[0x5581a8c5b97a]
postgres: background writer (BackgroundWriterMain+0x20b)[0x5581a8bf117a]
postgres: background writer (AuxiliaryProcessMain+0x175)[0x5581a8befa29]
postgres: background writer (+0x423cff)[0x5581a8bf5cff]
postgres: background writer (PostmasterMain+0x1127)[0x5581a8bf916f]
postgres: background writer (main+0x227)[0x5581a8b1d4d5]

To ease reproduction, adding the following delay is recommended:
--- a/src/backend/storage/buffer/freelist.c
+++ b/src/backend/storage/buffer/freelist.c
@@ -417,2 +417,3 @@ StrategySyncStart(...)
        }
+pg_usleep(300);
        SpinLockRelease(&StrategyControl->buffer_strategy_lock);

(Initially observed during the test 027_stream_regress (which runs with
shared_buffers = '1MB') with the minimal bgwriter_delay on a slow
dual-core machine, where one test run takes around 1000 sec.)

Reproduced on REL_12_STABLE .. master. In fact, the issue reproduced
starting from d72731a70.

#2Tender Wang
tndrwang@gmail.com
In reply to: PG Bug reporting form (#1)
Re: BUG #18385: Assert("strategy_delta >= 0") in BgBufferSync() fails due to race condition

Hi Alexander,
I haven't been able to reproduce this issue on my machine(2 vCPU, 2GB
memory).

Can you reproduce this issue reliably on your machine?

PG Bug reporting form <noreply@postgresql.org> 于2024年3月11日周一 03:08写道:

The following bug has been logged on the website:

Bug reference: 18385
Logged by: Alexander Lakhin
Email address: exclusion@gmail.com
PostgreSQL version: 16.2
Operating system: Ubuntu 22.04
Description:

With a small shared_buffers value and a short bgwriter_delay:
shared_buffers = '1MB'
bgwriter_delay = 10

processing concurrent writing workload like:
pgbench -i
pgbench -t 10000 -c 40

on a slow machine leads to:
number of transactions actually processed: 103642/400000
...
tps = 187.796284 (without initial connection time)
pgbench: error: Run was aborted; the above results are incomplete.
...
TRAP: failed Assert("strategy_delta >= 0"), File: "bufmgr.c", Line: 2836,
PID: 20941
postgres: background writer (ExceptionalCondition+0x52)[0x5581a8dd1677]
postgres: background writer (BgBufferSync+0xb6)[0x5581a8c5b97a]
postgres: background writer (BackgroundWriterMain+0x20b)[0x5581a8bf117a]
postgres: background writer (AuxiliaryProcessMain+0x175)[0x5581a8befa29]
postgres: background writer (+0x423cff)[0x5581a8bf5cff]
postgres: background writer (PostmasterMain+0x1127)[0x5581a8bf916f]
postgres: background writer (main+0x227)[0x5581a8b1d4d5]

To ease reproduction, adding the following delay is recommended:
--- a/src/backend/storage/buffer/freelist.c
+++ b/src/backend/storage/buffer/freelist.c
@@ -417,2 +417,3 @@ StrategySyncStart(...)
}
+pg_usleep(300);
SpinLockRelease(&StrategyControl->buffer_strategy_lock);

(Initially observed during the test 027_stream_regress (which runs with
shared_buffers = '1MB') with the minimal bgwriter_delay on a slow
dual-core machine, where one test run takes around 1000 sec.)

Reproduced on REL_12_STABLE .. master. In fact, the issue reproduced
starting from d72731a70.

--
Tender Wang
OpenPie: https://en.openpie.com/

#3Alexander Lakhin
exclusion@gmail.com
In reply to: Tender Wang (#2)
Re: BUG #18385: Assert("strategy_delta >= 0") in BgBufferSync() fails due to race condition

Hi Tender Wang,

13.03.2024 09:50, Tender Wang wrote:

Hi Alexander,
   I haven't been able to reproduce this issue on my machine(2 vCPU, 2GB memory).

 Can you reproduce this issue reliably on your machine?

Thanks for your attention to this issue!

On my 12-core workstation, where pgbench shows approximately 2500 tps,
`pgbench -t 10000 -c 40` failed on iterations 1, 1, 3:
...
number of transactions actually processed: 398639/400000
number of failed transactions: 0 (0.000%)
latency average = 15.931 ms
initial connection time = 56.806 ms
tps = 2510.845701 (without initial connection time)
pgbench: error: Run was aborted; the above results are incomplete.
...

The server built with gcc,
CPPFLAGS="-O0" ./configure --enable-debug --enable-cassert ...

I have also intensified bgwriter as follows:
          rc = WaitLatch(MyLatch,
                         WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
-                       BgWriterDelay /* ms */ , WAIT_EVENT_BGWRITER_MAIN);
+                       1 /* ms */ , WAIT_EVENT_BGWRITER_MAIN);

That is, I have the attached modification applied and the following in my
extra.config:
fsync = off
autovacuum = off
shared_buffers = '1MB'

Best regards,
Alexander

Attachments:

debug-18385.patchtext/x-patch; charset=UTF-8; name=debug-18385.patchDownload+2-1
#4Tender Wang
tndrwang@gmail.com
In reply to: Alexander Lakhin (#3)
Re: BUG #18385: Assert("strategy_delta >= 0") in BgBufferSync() fails due to race condition

Alexander Lakhin <exclusion@gmail.com> 于2024年3月13日周三 16:00写道:

Hi Tender Wang,

13.03.2024 09:50, Tender Wang wrote:

Hi Alexander,
I haven't been able to reproduce this issue on my machine(2 vCPU, 2GB

memory).

Can you reproduce this issue reliably on your machine?

Thanks for your attention to this issue!

On my 12-core workstation, where pgbench shows approximately 2500 tps,
`pgbench -t 10000 -c 40` failed on iterations 1, 1, 3:
...
number of transactions actually processed: 398639/400000
number of failed transactions: 0 (0.000%)
latency average = 15.931 ms
initial connection time = 56.806 ms
tps = 2510.845701 (without initial connection time)
pgbench: error: Run was aborted; the above results are incomplete.
...

The server built with gcc,
CPPFLAGS="-O0" ./configure --enable-debug --enable-cassert ...

I have also intensified bgwriter as follows:
rc = WaitLatch(MyLatch,
WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
-                       BgWriterDelay /* ms */ , WAIT_EVENT_BGWRITER_MAIN);
+                       1 /* ms */ , WAIT_EVENT_BGWRITER_MAIN);

That is, I have the attached modification applied and the following in my
extra.config:
fsync = off
autovacuum = off
shared_buffers = '1MB'

Using your provided patch and configuration, I can reproduce this issue
easily on my mashine(2 vCPU, 2GB).
I went throug the StrategySyncStart() and ClockSweepTick() code. After
d72731a70 commit, operation on StrategyControl->nextVictimBuffer
doesn't need to get the StrategyControl->buffer_strategy_lock, but
StrategyControl->completePasses still needs lock.

Before d72731a70, if bgwriter gets the
StrategyControl->buffer_strategy_lock spinlock in StrategySyncStart(),
backends can't add StrategyControl->nextVictimBuffer,
then bgwriter can get a consistent value consisting of nextVictimBuffer
and completePasses.

But now, even though bgwriter gets the spinlock, backends also could add
the StrategyControl->nextVictimBuffer. In corner cases, bgwriter will see a
litter StrategyControl->nextVictimBuffer
value due to StrategyControl->nextVictimBuffer wraparound. But
StrategyControl->completePasses didn't update. So the
Assert(strategy_delta >= 0) will trigger.

I have two solutions in my head:
1. remove the Assert, but I'm not srue bgwriter write strategy should
change if strategy_delta < 0.
2. add more check in BgBufferSync() just like below:
/*
* Since nextVictimBuffer in StrategyControl has been atomic.
* So its operation would not need to get buffer_strategy_lock.
* In extreme circumstances, StrategySyncStart would see not consistent
* value consisting of nextVictimBuffer and completePasses.
* So we add one to passes_delta to make strategy_delta >= 0.
*/
if (passes_delta == 0 && strategy_delta < 0)
passes_delta++;

I test two times using the second solution, not trigger Assert again. Need
more times to test.

--
Tender Wang
OpenPie: https://en.openpie.com/

#5Tender Wang
tndrwang@gmail.com
In reply to: Tender Wang (#4)
Re: BUG #18385: Assert("strategy_delta >= 0") in BgBufferSync() fails due to race condition

Sorry, the second solution is still not fix the problem. I met Assert
failed:
strategy_passes is 100964, but prev_strategy_passes is 100965. The added
check was not satisfied.
Would it happen that current strategy_passes is less than
prev_strategy_passes.

Tender Wang <tndrwang@gmail.com> 于2024年3月15日周五 16:05写道:

Alexander Lakhin <exclusion@gmail.com> 于2024年3月13日周三 16:00写道:

Hi Tender Wang,

13.03.2024 09:50, Tender Wang wrote:

Hi Alexander,
I haven't been able to reproduce this issue on my machine(2 vCPU,

2GB memory).

Can you reproduce this issue reliably on your machine?

Thanks for your attention to this issue!

On my 12-core workstation, where pgbench shows approximately 2500 tps,
`pgbench -t 10000 -c 40` failed on iterations 1, 1, 3:
...
number of transactions actually processed: 398639/400000
number of failed transactions: 0 (0.000%)
latency average = 15.931 ms
initial connection time = 56.806 ms
tps = 2510.845701 (without initial connection time)
pgbench: error: Run was aborted; the above results are incomplete.
...

The server built with gcc,
CPPFLAGS="-O0" ./configure --enable-debug --enable-cassert ...

I have also intensified bgwriter as follows:
rc = WaitLatch(MyLatch,
WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
-                       BgWriterDelay /* ms */ ,
WAIT_EVENT_BGWRITER_MAIN);
+                       1 /* ms */ , WAIT_EVENT_BGWRITER_MAIN);

That is, I have the attached modification applied and the following in my
extra.config:
fsync = off
autovacuum = off
shared_buffers = '1MB'

Using your provided patch and configuration, I can reproduce this issue
easily on my mashine(2 vCPU, 2GB).
I went throug the StrategySyncStart() and ClockSweepTick() code. After
d72731a70 commit, operation on StrategyControl->nextVictimBuffer
doesn't need to get the StrategyControl->buffer_strategy_lock, but
StrategyControl->completePasses still needs lock.

Before d72731a70, if bgwriter gets the
StrategyControl->buffer_strategy_lock spinlock in StrategySyncStart(),
backends can't add StrategyControl->nextVictimBuffer,
then bgwriter can get a consistent value consisting of nextVictimBuffer
and completePasses.

But now, even though bgwriter gets the spinlock, backends also could add
the StrategyControl->nextVictimBuffer. In corner cases, bgwriter will see a
litter StrategyControl->nextVictimBuffer
value due to StrategyControl->nextVictimBuffer wraparound. But
StrategyControl->completePasses didn't update. So the
Assert(strategy_delta >= 0) will trigger.

I have two solutions in my head:
1. remove the Assert, but I'm not srue bgwriter write strategy should
change if strategy_delta < 0.
2. add more check in BgBufferSync() just like below:
/*
* Since nextVictimBuffer in StrategyControl has been atomic.
* So its operation would not need to get buffer_strategy_lock.
* In extreme circumstances, StrategySyncStart would see not consistent
* value consisting of nextVictimBuffer and completePasses.
* So we add one to passes_delta to make strategy_delta >= 0.
*/
if (passes_delta == 0 && strategy_delta < 0)
passes_delta++;

I test two times using the second solution, not trigger Assert again. Need
more times to test.

--
Tender Wang
OpenPie: https://en.openpie.com/

--
Tender Wang
OpenPie: https://en.openpie.com/