heap_update is broken in current sources

Started by Tom Lane about 25 years ago · 4 messages
#1 Tom Lane
tgl@sss.pgh.pa.us

heap_update() currently ends with

    if (newbuf != buffer)
    {
        LockBuffer(newbuf, BUFFER_LOCK_UNLOCK);
        WriteBuffer(newbuf);
    }
    LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
    WriteBuffer(buffer);

    /* invalidate caches */
    RelationInvalidateHeapTuple(relation, &oldtup);
    RelationMark4RollbackHeapTuple(relation, newtup);

    return HeapTupleMayBeUpdated;

This is broken because WriteBuffer releases our refcounts on the buffer
pages that are holding the old and new tuples. By the time
RelationInvalidateHeapTuple gets to do its thing, some other backend may
have swapped a new disk page into the shared buffer that oldtup points
at. catcache.c will then be using the wrong data to compute the hash
index of the old tuple. This will at minimum result in failure to
invalidate the old tuple out of our catcache (because we'll be searching
the wrong hashchains), and can lead to a flat-out crash or Assert
failure due to invalid data being fed to the hashing code.

I have seen several nonrepeatable failures in the parallel regress tests
in recent weeks, which I now believe are all traceable to this error.

I will commit a fix for this error shortly, and have recommended to Marc
that he re-roll the beta2 tarball before announcing it...
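
The essence of the fix is simply to reorder the tail of heap_update() so that the cache-invalidation calls run while we still hold our pins on both pages. Roughly, as a sketch of that reordering (the committed version may differ in details):

    /* release the content locks, but keep our pins for now */
    if (newbuf != buffer)
        LockBuffer(newbuf, BUFFER_LOCK_UNLOCK);
    LockBuffer(buffer, BUFFER_LOCK_UNLOCK);

    /*
     * Invalidate caches while the pins still guarantee that oldtup and
     * newtup point at valid page contents.
     */
    RelationInvalidateHeapTuple(relation, &oldtup);
    RelationMark4RollbackHeapTuple(relation, newtup);

    /* now it is safe to let WriteBuffer drop the refcounts */
    if (newbuf != buffer)
        WriteBuffer(newbuf);
    WriteBuffer(buffer);

    return HeapTupleMayBeUpdated;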

regards, tom lane

#2 The Hermit Hacker
scrappy@hub.org
In reply to: Tom Lane (#1)
Re: heap_update is broken in current sources

How are we on this?

On Sun, 7 Jan 2001, Tom Lane wrote:

> heap_update() currently ends with
>
>     if (newbuf != buffer)
>     {
>         LockBuffer(newbuf, BUFFER_LOCK_UNLOCK);
>         WriteBuffer(newbuf);
>     }
>     LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
>     WriteBuffer(buffer);
>
>     /* invalidate caches */
>     RelationInvalidateHeapTuple(relation, &oldtup);
>     RelationMark4RollbackHeapTuple(relation, newtup);
>
>     return HeapTupleMayBeUpdated;
>
> This is broken because WriteBuffer releases our refcounts on the buffer
> pages that are holding the old and new tuples. By the time
> RelationInvalidateHeapTuple gets to do its thing, some other backend may
> have swapped a new disk page into the shared buffer that oldtup points
> at. catcache.c will then be using the wrong data to compute the hash
> index of the old tuple. This will at minimum result in failure to
> invalidate the old tuple out of our catcache (because we'll be searching
> the wrong hashchains), and can lead to a flat-out crash or Assert
> failure due to invalid data being fed to the hashing code.
>
> I have seen several nonrepeatable failures in the parallel regress tests
> in recent weeks, which I now believe are all traceable to this error.
>
> I will commit a fix for this error shortly, and have recommended to Marc
> that he re-roll the beta2 tarball before announcing it...
>
> regards, tom lane

Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org

#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: The Hermit Hacker (#2)
Re: heap_update is broken in current sources

The Hermit Hacker <scrappy@hub.org> writes:
> How are we on this?

It's fixed.

I've also run the regress tests with bufmgr.c hacked up to discard
pages (with forcible overwriting) as soon as their refcount goes to
zero. That didn't disclose any similar bugs, although the coverage
of the tests leaves much to be desired :-(
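
For concreteness, here is a standalone sketch of the idea behind that hack (a hypothetical illustration of "forcible overwriting on refcount zero", not the actual bufmgr.c change; FakeBuffer and release_buffer are made-up names):

    #include <stdio.h>
    #include <string.h>

    #define BLCKSZ 8192             /* page size, as in PostgreSQL */
    #define CLOBBER_BYTE 0x7F       /* arbitrary poison pattern */

    typedef struct
    {
        int  refcount;
        char page[BLCKSZ];
    } FakeBuffer;

    /*
     * Release one pin; the moment the last pin goes away, trash the page
     * so that any code still reading it through a stale pointer sees
     * obvious garbage instead of plausible-looking tuple data.
     */
    static void
    release_buffer(FakeBuffer *buf)
    {
        if (--buf->refcount == 0)
            memset(buf->page, CLOBBER_BYTE, BLCKSZ);
    }

    int
    main(void)
    {
        FakeBuffer buf = {1, ""};

        snprintf(buf.page, BLCKSZ, "tuple data");
        release_buffer(&buf);   /* refcount hits zero: page is clobbered */

        /* a dangling reader now sees poison bytes, not tuple data */
        printf("first byte after release: 0x%02X\n",
               (unsigned char) buf.page[0]);
        return 0;
    }

Running the regression tests against a buffer manager that behaves this way makes any code that touches a page after dropping its last pin fail quickly and visibly rather than intermittently.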

regards, tom lane

#4 The Hermit Hacker
scrappy@hub.org
In reply to: Tom Lane (#3)
Re: heap_update is broken in current sources

okay, will bundle up beta2 and announce it tonight when I get home ...
gives about 6 hrs or so to "halt the presses" *grin*

On Mon, 8 Jan 2001, Tom Lane wrote:

> The Hermit Hacker <scrappy@hub.org> writes:
>> How are we on this?
>
> It's fixed.
>
> I've also run the regress tests with bufmgr.c hacked up to discard
> pages (with forcible overwriting) as soon as their refcount goes to
> zero. That didn't disclose any similar bugs, although the coverage
> of the tests leaves much to be desired :-(
>
> regards, tom lane

Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org