can while loop in ClockSweepTick function be kind of infinite loop in some cases?

Started by 斯波隼斗over 3 years ago5 messageshackers
Jump to latest
#1斯波隼斗
shibahayaton@gmail.com

This question is about ClockSweepTick function and the code is below.
https://github.com/postgres/postgres/blob/24d2b2680a8d0e01b30ce8a41c4eb3b47aca5031/src/backend/storage/buffer/freelist.c#L146-L165

The value of expected, NBuffers, wrapped variable is fixed in the while
loop, so that when the value of expected variable is not equal to
StrategyControl->nextVictimBuffer, CAS operation fails and the while loop
will be run kind-of infinitely.
It is possible for this problem to occur when ClockSweepTick function is
concurrently called and nextVictimBuffer is incremented by other process
before CAS operation in the loop (ex: in this case, the value of expected
variable is NBuffers+1 while the value of nextVictimBuffer variable is
NBuffers+2. so CAS operation fails)
I think. `expected = originalVictim + 1;` line should be in while loop
(before acquiring spin lock) so that, even in the case above, expected
variable is incremented for each loop and CAS operation will be successful
at some point.
Is my understanding correct? If so, I will send PR for fixing this issue.

Thank you in advance
Hayato Shiba

#2Andres Freund
andres@anarazel.de
In reply to: 斯波隼斗 (#1)
Re: can while loop in ClockSweepTick function be kind of infinite loop in some cases?

Hi,

On 2023-01-11 01:25:06 +0900, 斯波隼斗 wrote:

This question is about ClockSweepTick function and the code is below.
https://github.com/postgres/postgres/blob/24d2b2680a8d0e01b30ce8a41c4eb3b47aca5031/src/backend/storage/buffer/freelist.c#L146-L165

The value of expected, NBuffers, wrapped variable is fixed in the while
loop, so that when the value of expected variable is not equal to
StrategyControl->nextVictimBuffer, CAS operation fails and the while loop
will be run kind-of infinitely.
It is possible for this problem to occur when ClockSweepTick function is
concurrently called and nextVictimBuffer is incremented by other process
before CAS operation in the loop (ex: in this case, the value of expected
variable is NBuffers+1 while the value of nextVictimBuffer variable is
NBuffers+2. so CAS operation fails)
I think. `expected = originalVictim + 1;` line should be in while loop
(before acquiring spin lock) so that, even in the case above, expected
variable is incremented for each loop and CAS operation will be successful
at some point.
Is my understanding correct? If so, I will send PR for fixing this issue.

Yes, I think your understanding might be correct. Interesting that this
apparently has never occurred.

Yes, please send a patch.

Thanks,

Andres

#3Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#2)
Re: can while loop in ClockSweepTick function be kind of infinite loop in some cases?

On Tue, Jan 10, 2023 at 12:40 PM Andres Freund <andres@anarazel.de> wrote:

I think. `expected = originalVictim + 1;` line should be in while loop
(before acquiring spin lock) so that, even in the case above, expected
variable is incremented for each loop and CAS operation will be successful
at some point.
Is my understanding correct? If so, I will send PR for fixing this issue.

Yes, I think your understanding might be correct. Interesting that this
apparently has never occurred.

Doesn't pg_atomic_compare_exchange_u32 set expected if it fails?

--
Robert Haas
EDB: http://www.enterprisedb.com

#4Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#3)
Re: can while loop in ClockSweepTick function be kind of infinite loop in some cases?

Hi,

On 2023-01-10 13:11:35 -0500, Robert Haas wrote:

On Tue, Jan 10, 2023 at 12:40 PM Andres Freund <andres@anarazel.de> wrote:

I think. `expected = originalVictim + 1;` line should be in while loop
(before acquiring spin lock) so that, even in the case above, expected
variable is incremented for each loop and CAS operation will be successful
at some point.
Is my understanding correct? If so, I will send PR for fixing this issue.

Yes, I think your understanding might be correct. Interesting that this
apparently has never occurred.

Doesn't pg_atomic_compare_exchange_u32 set expected if it fails?

Indeed, so there's no problem.

I wonder if we should make ->nextVictimBuffer a 64bit atomic. At the time the
changes went in we didn't (or rather, couldn't) rely on it, but these days we
could. I think with a 64bit number we could get rid of ->completePasses and
just infer it from ->nextVictimBuffer/NBuffers, removing th need for the
spinlock. It's very unlikely that 64bit would ever wrap, and even if, it'd
just be a small inaccuracy in the assumed number of passes. OTOH, it's
doubtful the overflow handling / the spinlock matters performance wise.

Greetings,

Andres Freund

#5斯波隼斗
shibahayaton@gmail.com
In reply to: Andres Freund (#4)
Re: can while loop in ClockSweepTick function be kind of infinite loop in some cases?

Hi, Thank you for your quick reply

I misunderstood the logic of pg_atomic_compare_exchange_u32, so the loop
cannot be infinite.

I wonder if we should make ->nextVictimBuffer a 64bit atomic. At the time

the changes went in we didn't (or rather, couldn't) rely on it, but these
days we could. I think with a 64bit number we could get rid of
->completePasses and just infer it from ->nextVictimBuffer/NBuffers,
removing th need for the spinlock. It's very unlikely that 64bit would
ever wrap, and even if, it'd just be a small inaccuracy in the assumed
number of passes. OTOH, it's doubtful the overflow handling / the spinlock
matters performance wise.

I'm not sure why 64 bit was not used at the time, so I'm concerned about
it.
but, except for it, you have a point and I completely agree with you. as
you have said, we should use 64 bit whose higher-order 32 bit indicates
completePasses, and should remove spinlock.
maybe we don't have to exceptionally worry about the overflow here mainly
because, even now, the completePasses can overflow and the possibility of
overflow may not be so different so that the 64 bit atomic operation is
better.

if overflow would happen, passes_delta variable in the function called by
bgwriter would be negative high value and it would lead to the failure of
assert. (the code is below
https://github.com/postgres/postgres/blob/d9d873bac67047cfacc9f5ef96ee488f2cb0f1c3/src/backend/storage/buffer/bufmgr.c#L2298-L2303

Do you send patch for the replacement with 64 bit? If you don't mind, I
would like to send patch. ( or is there some procedure before sending patch?

Thanks
hayato

2023年1月11日(水) 3:59 Andres Freund <andres@anarazel.de>:

Show quoted text

Hi,

On 2023-01-10 13:11:35 -0500, Robert Haas wrote:

On Tue, Jan 10, 2023 at 12:40 PM Andres Freund <andres@anarazel.de>

wrote:

I think. `expected = originalVictim + 1;` line should be in while

loop

(before acquiring spin lock) so that, even in the case above,

expected

variable is incremented for each loop and CAS operation will be

successful

at some point.
Is my understanding correct? If so, I will send PR for fixing this

issue.

Yes, I think your understanding might be correct. Interesting that this
apparently has never occurred.

Doesn't pg_atomic_compare_exchange_u32 set expected if it fails?

Indeed, so there's no problem.

I wonder if we should make ->nextVictimBuffer a 64bit atomic. At the time
the
changes went in we didn't (or rather, couldn't) rely on it, but these days
we
could. I think with a 64bit number we could get rid of ->completePasses
and
just infer it from ->nextVictimBuffer/NBuffers, removing th need for the
spinlock. It's very unlikely that 64bit would ever wrap, and even if, it'd
just be a small inaccuracy in the assumed number of passes. OTOH, it's
doubtful the overflow handling / the spinlock matters performance wise.

Greetings,

Andres Freund