Circular-freelist bug is still there

Started by Tom Lane, about 22 years ago | 4 messages | hackers
#1 Tom Lane
tgl@sss.pgh.pa.us

I just saw the parallel regression tests hang up again. Inspection
revealed that StrategyInvalidateBuffer() was stuck in an infinite loop
because the freelist was circular.

(gdb) p StrategyControl->listFreeBuffers
$5 = 579
(gdb) p BufferDescriptors[579]
$6 = {bufNext = 106, data = 4991904, tag = {rnode = {tblNode = 17142,
relNode = 143947}, blockNum = 0}, buf_id = 579, flags = 14,
refcount = 0, io_in_progress_lock = 1179, cntx_lock = 1180,
cntxDirty = 0 '\000', wait_backend_id = 0}
(gdb) p BufferDescriptors[106]
$7 = {bufNext = 684, data = 1117088, tag = {rnode = {tblNode = 17142,
relNode = 143989}, blockNum = 0}, buf_id = 106, flags = 14,
refcount = 0, io_in_progress_lock = 233, cntx_lock = 234,
cntxDirty = 0 '\000', wait_backend_id = 0}
(gdb) p BufferDescriptors[684]
$8 = {bufNext = 579, data = 5852064, tag = {rnode = {tblNode = 17142,
relNode = 143929}, blockNum = 0}, buf_id = 684, flags = 14,
refcount = 0, io_in_progress_lock = 1389, cntx_lock = 1390,
cntxDirty = 0 '\000', wait_backend_id = 0}
(gdb)

Don't have time to chase it right now, but you should know that there's
still a low-probability bug in there.

regards, tom lane
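
[Editor's sketch] The gdb output above shows the chain 579 -> 106 -> 684 -> 579, so any walk that waits for an end-of-list marker never terminates. Below is a minimal sketch of how such a loop could be detected with Floyd's two-pointer check on an index-linked list. It is not the PostgreSQL code: DemoBufferDesc, LIST_END and freelist_is_circular are hypothetical names, and the use of -1 as the end-of-list marker is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define LIST_END (-1)   /* assumed end-of-list marker, for illustration only */

/* Hypothetical stand-in for a buffer descriptor: only the link field. */
typedef struct
{
    int bufNext;        /* index of the next free buffer, or LIST_END */
} DemoBufferDesc;

/*
 * Returns 1 if the chain starting at 'head' loops back on itself,
 * 0 if it reaches LIST_END.  'slow' advances one link per step and
 * 'fast' advances two; they can only meet again if there is a cycle.
 */
static int
freelist_is_circular(const DemoBufferDesc *descs, int head)
{
    int slow = head;
    int fast = head;

    assert(descs != NULL);

    while (fast != LIST_END && descs[fast].bufNext != LIST_END)
    {
        slow = descs[slow].bufNext;
        fast = descs[descs[fast].bufNext].bufNext;
        if (slow == fast)
            return 1;
    }
    return 0;
}
```

Fed the three bufNext values from the dump, with every other entry set to LIST_END, the check reports a cycle after three steps. A check of this kind only detects the corruption; it does not explain how the buffer got linked into the list twice.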

#2 Jan Wieck
JanWieck@Yahoo.com
In reply to: Tom Lane (#1)
Re: Circular-freelist bug is still there

Tom Lane wrote:

I just saw the parallel regression tests hang up again. Inspection
revealed that StrategyInvalidateBuffer() was stuck in an infinite loop
because the freelist was circular.

Don't have time to chase it right now, but you should know that there's
still a low-probability bug in there.

I was under the impression that Neil was still working on this, though
I don't recall exactly why.

Anyhow, according to our discussion in early January I have changed the
code in StrategyInvalidateBuffer() so that it clears out the buffer tag
and the CDB's buffer tag. Also it will error out if the CDB is not found
at all.

The BM_FREE flag (meaning BM_UNPINNED effectively) is gone and replaced
with direct checks against the refcount.

Thanks for the reminder,
Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #
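
[Editor's sketch] The change described above has three parts: clear the buffer's tag, clear the matching CDB's tag, and fail loudly if no CDB is found, with "unpinned" decided by the refcount rather than a flag bit. The following is a rough sketch of that shape only, not the committed code. All names here (DemoTag, DemoBuf, DemoCDB, demo_invalidate, demo_is_unpinned, the DEMO_* codes) are hypothetical, the linear search stands in for whatever lookup the real code uses, and returning an error code stands in for erroring out.

```c
#include <assert.h>
#include <string.h>

#define DEMO_OK          0
#define DEMO_ERR_NO_CDB  (-1)   /* hypothetical "CDB not found" result */

/* Hypothetical, simplified stand-ins for the structures in the thread. */
typedef struct
{
    unsigned tblNode;
    unsigned relNode;
    unsigned blockNum;
} DemoTag;

typedef struct
{
    DemoTag tag;
    int     buf_id;
    int     refcount;
} DemoBuf;

typedef struct
{
    DemoTag tag;
    int     buf_id;     /* buffer this directory entry points at */
} DemoCDB;

/* Direct refcount test, in place of a separate "free" flag bit. */
static int
demo_is_unpinned(const DemoBuf *buf)
{
    return buf->refcount == 0;
}

/*
 * Clear the tag in both the buffer and its directory entry.
 * Reports DEMO_ERR_NO_CDB instead of silently continuing when
 * no directory entry refers to the buffer.
 */
static int
demo_invalidate(DemoBuf *buf, DemoCDB *cdbs, int ncdbs)
{
    int i;

    assert(buf != NULL && cdbs != NULL);

    for (i = 0; i < ncdbs; i++)
    {
        if (cdbs[i].buf_id == buf->buf_id)
        {
            memset(&cdbs[i].tag, 0, sizeof(DemoTag));
            memset(&buf->tag, 0, sizeof(DemoTag));
            return DEMO_OK;
        }
    }
    return DEMO_ERR_NO_CDB;
}
```

Deriving "unpinned" from the refcount leaves one source of truth, so a flag bit cannot drift out of step with the count.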

#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jan Wieck (#2)
Re: Circular-freelist bug is still there

Jan Wieck <JanWieck@Yahoo.com> writes:

Tom Lane wrote:

I just saw the parallel regression tests hang up again.

Anyhow, according to our discussion in early January I have changed the
code in StrategyInvalidateBuffer() so that it clears out the buffer tag
and the CDB's buffer tag. Also it will error out if the CDB is not found
at all.

Oh, okay. So when's that fix going to be committed?

regards, tom lane

#4 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#3)
Re: Circular-freelist bug is still there

Oh, okay. So when's that fix going to be committed?

Never mind, I see you just did ...

regards, tom lane